Judicial interference with mifepristone


In the days since Texas federal judge Matthew J.
Kacsmaryk invalidated the approval by the US Food 
and Drug Administration (FDA) of mifepristone, a 
medication used to terminate pregnancy, a shock 
wave of concern has swept through many people, 
organizations, and companies that work closely 
with the agency. The strong opposition reflects the 
high stakes not only for pregnant persons and for the 
FDA, but also for the scientific process of drug devel- 
opment and public access to safe and effective medica- 
tions. Twists and turns in the case are already happen- 
ing. A federal appeals court stayed the full suspension 
of mifepristone, but permitted multiple restrictions on 
its availability. Then the Supreme Court, which recently 
overturned the constitutional right to abortion, kept 
the status quo in place for a few days 
while considering the government’s 
appeal. The results of the legal battle 
will be enormously consequential 
for reproductive health care—and 
far beyond, for innovation, science, 
and health. 

The FDA plays such an important 
role in the health of Americans that 
it is easy to take its functions for 
granted. More than 15,000 agency 
employees regulate an estimated 
$2.7 trillion in consumer goods, in- 
cluding all medical products. Over 
more than a century, the FDA has 
developed extensive processes that govern the col- 
lection and review of preclinical and clinical data on 
biologics and drugs with defined scientific and legal- 
regulatory standards, earning high levels of trust from 
the public in the process. 

The agency’s review of mifepristone in 2000 was 
thorough and fair. The drug’s manufacturer submitted 
a large dataset for the agency’s experts to review. An 
external advisory panel supported its approval. After 
a 6-month review, FDA’s scientific staff concluded that 
mifepristone is safe and effective. Over the past two de- 
cades, the medication’s safety record has grown stron- 
ger, with major medical professional associations in full 
support of access. Over time, the FDA, after thorough 
safety reviews, loosened restrictions on distribution. 

The FDA’s expertise and diligence, however, barely 
seemed to matter to Kacsmaryk in his unprecedented 
decision last week. The judge’s use of extreme rhetoric, 
reliance on noncredible sources, and tendentious rea- 
soning may have raised the hopes of the plaintiffs, who 
have a strong ideological opposition to abortion, but the
decision also shredded any pretense of judicial objec- 
tivity and lowered the bar for efforts to overturn well- 
considered and justified determinations by the FDA. 

It’s no surprise that a broad diversity of patient ad- 
vocacy groups have expressed alarm. “The implica- 
tions of this ruling go far beyond mifepristone,” read 
a statement of opposition signed by 30 organizations 
representing those with serious health conditions. They 
warned, “If this judge’s ruling is allowed to stand, pa- 
tients may no longer have the security of knowing that 
determinations about drug safety ultimately lie with the 
experts.” Also objecting are pharmaceutical companies 
that recognize the threat to innovation posed by arbi- 
trary judicial orders. The interim director of the biotech 
organization BIO described the deci- 
sion as “an assault on science” while 
the pharmaceutical organization 
PhRMA expressed “serious concerns 
with any court substituting its opin- 
ion for the FDA’s expert approval 
decision-making.” 

In an amicus brief to the US Court 
of Appeals for the Fifth Circuit, many 
leading companies and trade organi- 
zations declared that if upheld, the 
decision could “empower any plain- 
tiff to grind drug approvals to a halt, 
disrupting patients’ access to critical 
medicines.” The brief also noted the 
potential to “wreak havoc on drug development and ap- 
proval generally, causing widespread harm to patients, 
providers, and the entire pharmaceutical industry.” 

As the former commissioner and principal deputy 
commissioner of the FDA, we couldn’t have said it bet- 
ter ourselves. The FDA is a unique institution, bringing 
together intellectual resources from inside and outside 
government to make decisions on thousands of prod- 
ucts each year. Once courts dismiss core scientific judg- 
ments by the agency, there is no reason to believe they 
will limit themselves to this one medication. There is 
already political pressure against vaccines, antidepres- 
sants and other psychotropic medication, and certain 
cell-derived therapies. If judges begin to dictate the 
terms of medication access, then others will seek to use 
ideology and influence to advance their agendas. 

Respect for the integrity of the FDA underlies decades 
of progress in using science to save lives. Cracks in this 
foundation are as dangerous as they are unwarranted. 

—Margaret Hamburg and Joshua Sharfstein 
University of California, Davis, cell biologist Paul Knoepfler, in Vice,
about a recently ended Duke University program that gave autistic children unproven stem cell 
treatments. Allowed under FDA's “compassionate use” rules, it charged parents up to $15,000. 


India to build gravitational wave detector


After more than a decade of planning, India this week
approved building a gargantuan detector to sense
ripples in space and time called gravitational waves,
extending an international network of such devices.
Expected to be operational by the decade’s end, the
new detector, near Aundha in western India, will
be nearly identical to those in the Laser Interferometer
Gravitational-Wave Observatory (LIGO) in
Louisiana and Washington state. Each of those
detectors is an L-shaped optical device called
an interferometer with arms 4 kilometers long
that make ultraprecise measurements of space in per-
pendicular directions. LIGO-India will use a spare set
of LIGO mirrors and lasers. India will spend $320 mil-
lion to construct the vacuum chamber and buildings
to house the device. Since 2015, LIGO and Italy’s Virgo
detector have detected the fleeting signals of gravita-
tional waves from 90 high-energy celestial events in
which two massive objects such as black
holes collided. Adding another detector
should enable scientists to more precisely
pinpoint sources in the sky.


LIGO-India will deploy a set of ultraprecise, 40-kilogram mirrors left over from the U.S. LIGO project.


Ghana OKs new malaria vaccine


INFECTIOUS DISEASES | Although results
from clinical trials are still pending,
Ghana has become the first country to
approve a new malaria vaccine, which will
likely be less expensive than an existing
vaccine. The new vaccine, called R21/
Matrix-M, is unlikely to be used widely
before the World Health Organization
(WHO) endorses it, which would enable
international funders to purchase the
vaccine. R21/Matrix-M showed promising
results in a preliminary trial involving
450 children, but researchers have yet
to report final results from a trial with
4800 children in four countries. Ghana
is already using the first malaria vaccine,
endorsed by WHO and called Mosquirix;
more than 450,000 children had received
at least one dose by December 2022.
R21/Matrix-M will likely require at least
four doses per child at $3 a dose, and
global health experts caution that vac-
cines should be weighed against cheaper
malaria-control tools, such as insecticide-
treated bed nets.
U.S. wields light touch on fusion


ENERGY | The U.S. Nuclear Regulatory 
Commission (NRC) decided last week to 
regulate future fusion power plants under 
standards used for particle accelerators and 
radioactive medical technologies, not the 
more stringent rules it requires for nuclear 
fission power plants. The Fusion Industry 
Association (FIA) had sought the decision, 
contending that fusion power generation will 
be less dangerous than fission plants and 
shouldn’t be regulated as tightly. Operators 
of fusion plants will need to handle small 
amounts of tritium—a short-lived radio- 
active isotope of hydrogen—and safely dis- 
pose of low-level waste consisting of reactor 
parts made radioactive by fusion reactions. 
The hazards are similar to those of particle 
accelerators, according to FIA. Although a 
commercial fusion reactor remains possibly 
decades away, the NRC decision “will give 
fusion developers the regulatory certainty 
they need to innovate,” FIA says.


Early test for Parkinson’s disease 


BIOMEDICINE | A large study has shown 
that a test can indicate a person has 
Parkinson’s disease before they start having 
symptoms. The chronic degenerative disease 
currently lacks a definitive biochemical test. 
A research team recruited 1123 participants, 
some of whom had symptoms of Parkinson’s, 
and used spinal taps to measure their levels 
of a protein called alpha-synuclein, which 
clumps and damages brain cells in people 
with the disease. Study participants whose 
alpha-synuclein clumped to an extent above 
a threshold level were deemed to have 
Parkinson's. In 88% of the subjects, the test 
accurately indicated whether they had the 
disease, researchers reported last week in The 
Lancet Neurology. The new approach is likely 
too costly and invasive to be used widely for 
screening but may inform research into treat- 
ments, scientists say. 


Roll-up space telescope mirrors 


ASTRONOMY | Researchers have developed 
a technique to produce flexible, high-quality 
mirrors for space telescopes. The mirrors 
could be rolled up and stored compactly in 
rockets and enable more-powerful orbiting 
observatories. A team at the Max Planck 
Institute for Extraterrestrial Physics created 
prototypes up to 30 centimeters wide and 
plans to make larger ones. The mirrors are 
made by rotating liquid in a vacuum chamber 
as a parabolic “mold,” onto which chemical
vapors are deposited to form a polymer layer. 
A reflective metal layer is then added. They 
described the work in Adaptive Optics. 
Roman emperors played winemaker for a day


In ancient Rome, wine mattered so much that the emperor himself opened each
year's wine grape harvest by cutting a symbolic cluster of grapes. Now, archaeo-
logists have discovered that an elaborate facility just outside Rome probably was
built specially for this occasion. They excavated the unusual, 1000-square-meter
space in 2017 and 2018 and dated it to about 240 C.E. The facility was like a dude
ranch, where the nobility played winemaker for a day. It featured high-end imported 
red marble on which workers squashed grapes; juice flowed through marble-lined 
channels into fermentation vats beneath a decorated floor. The slippery floor material, 
unusual in a Roman winery, suggested the facility was designed for showing off, not 
efficiency, the research team writes this week in Antiquity. 


A ceremonial winery for Rome’s nobility featured wine fountains and private rooms, perhaps for dining.


Black doctors are Rx for longevity 


EQUITY | Black people who reside in
U.S. counties with at least one practicing
Black physician live longer than those
elsewhere—and their longevity increases
with the number of Black doctors, a study
has found. The analysis is among the first
to show that increasing diversity among
physicians can help lessen persistent racial
disparities in mortality. But research-
ers from the U.S. Health Resources and
Services Administration found that only 
half of the more than 3000 U.S. counties 
had even one Black physician in at least 
one of 3 years included in the study—2009, 
2014, and 2019. The findings about longev- 
ity may reflect that some Black people seek 
care from Black doctors based on shared 
culture, the team wrote last week in JAMA 
Network Open. But medical care should not 
be segregated, and all physicians should 
increase their “cultural competency,” tailor-
ing care to patients’ cultural needs, the
authors wrote. 


Editors bolt over author fees 


PUBLISHING | More than 40 aca-
demic editors at NeuroImage and an
affiliated journal resigned this week
to protest the fee that the owner, pub-
lishing giant Elsevier, charges to make
papers open access. In a statement, they
decried as “unethical and unsustain-
able” NeuroImage’s $3450 sum, billed
to authors to provide papers free to
read upon publication. NeuroImage has
published all articles open access since
2020. A companion journal, NeuroImage:
Reports, was introduced in 2021. The
departing editors say they asked Elsevier
to reduce NeuroImage’s fee, and it
refused. Now they plan to launch a new,
open-access journal—which they hope
will “replace NeuroImage as the top
journal in our field”—to be published
by the nonprofit MIT Press. An Elsevier
spokesperson says the publisher set
NeuroImage’s fee below the market aver-
age for publications of similar quality.
The 4776-meter-tall Pao Pao
Seamount (right) in the South
Pacific Ocean has been mapped by
sonar. Many others haven't.


A ‘mind boggling’ 19,000 seamounts discovered


New maps of undersea volcanoes could aid in studies of plate tectonics and ocean mixing 


By Paul Voosen 


The U.S. submarine fleet’s biggest ad-
versary lately hasn’t been Red Oc-
tober. In 2005, the nuclear-powered
USS San Francisco collided with an
underwater volcano, or seamount,
at top speed, killing a crew member
and injuring most aboard. It happened 
again in 2021 when the USS Connecticut 
struck a seamount in the South China Sea, 
damaging its sonar array. 

With only one-quarter of the sea floor 
mapped with sonar, it is impossible to 
know how many seamounts exist. But ra- 
dar satellites that measure ocean height 
can also find them, by looking for subtle 
signs of seawater mounding above a hid- 
den seamount, tugged by its gravity. A 2011 
census using the method found more than 
24,000. High-resolution radar data have 
now added more than 19,000 new ones. 
The vast majority—more than 27,000— 
remain uncharted by sonar. “It’s just mind 
boggling,” says David Sandwell, a marine 
geophysicist at the Scripps Institution of
Oceanography, who helped lead the work. 

Published this month in Earth and 
Space Science, the new seamount catalog is 
“a great step forward,” says Larry Mayer, 
director of the University of New Hamp- 
shire’s Center for Coastal and Ocean Map- 
ping. Besides posing navigational hazards, 
the mountains harbor rare-earth minerals 
that make them commercial targets for 
deep-sea miners. Their size and distribu- 
tion hold clues to plate tectonics and mag- 
matism. They are crucial oases for marine 
life. And they are pot-stirrers that help 
control the large-scale ocean flows respon- 
sible for sequestering vast amounts of heat 
and carbon dioxide, says John Lowell, chief 
hydrographer of the National Geospatial- 
Intelligence Agency (NGA), which runs the 
U.S. military’s satellite mapping efforts. 
“The better we understand the shape of the 
sea floor, the better we can prepare [for cli- 
mate change].” 

After the USS San Francisco accident, 
Sandwell and his colleagues secured fund-
ing from the Navy and NGA to hunt for
seamounts with satellites. They identified 
thousands, including 700 particularly shal- 
low ones that posed hazards to subma- 
rines. But the team knew its first catalog 
was far from complete. Now, armed with 
data from high-resolution radar satellites, 
including the European Space Agency’s 
CryoSat-2 and SARAL from the Indian and 
French space agencies, the team can detect 
seamounts just 1100 meters tall—close to 
the lower limit of what defines a seamount, 
Sandwell says. 

Seamounts often occur in chains formed 
as tectonic plates ride over stationary 
plumes of hot rock rising from the mantle. 
As a result, the catalog will pay immedi- 
ate dividends for studies of Earth’s interior, 
says Carmen Gaina, a geophysicist at the 
Queensland University of Technology. It 
has already identified new seamounts in 
the northeast Atlantic Ocean that could 
help track the evolution of the mantle 
plume that feeds Iceland’s volcanoes. The 
survey also spotted seamounts near a ridge 
in the Indian Ocean where fresh crust is
made as tectonic plates spread apart. They 
suggest a surprising amount of volcanism 
in a region once thought to be magma 
starved, Gaina says. 

To biologists, seamounts’ steep slopes 
resemble crowded, boisterous skyscrapers 
for corals and other marine life. “They’re 
oases for biodiversity and biomass,” says 
Amy Baco-Taylor, a deep-sea biologist at 
Florida State University. Whales use them 
as waypoints. But biologists debate the role 
seamounts play in marine biodiversity: 
Are they home to genetically distinct spe- 
cies, like remote islands? Or do they serve 
as stepping stones for life to hopscotch 
through the oceans? By pushing up the 
density of seamounts, the new maps could 
strengthen the argument for the latter, 
Baco-Taylor says. 

They will also boost efforts to pro- 
tect biodiversity in international waters 
under a new marine protection treaty 
(Science, 10 March, p. 971). “We can’t pro- 
tect the things if we don’t know they’re 
there,” says Chris Yesson, a marine bio- 
logist at the Zoological Society of London’s 
Institute of Zoology. The maps will pro- 
vide a practical payoff, Yesson adds: “We 
won't waste our time as much.” Some of 
his colleagues, he says, once traveled to 
the Indian Ocean to study a seamount that 
turned out to be a phantom created by an 
error in presonar depth records. 

Nowhere will the new maps be as im- 
portant as in understanding the ocean’s 
globe-girdling conveyor belt of currents. 
The currents ferry heat from the equator 
to the poles, where the water cools and 
gains density until it plunges downward,
carrying heat and carbon dioxide into the
abyss. But the flip side of this perpetual 
motion machine—deep ocean waters defy- 
ing gravity and rising upward—has long 
been a mystery. The “upwelling” was once 
thought to happen evenly across the ocean, 
driven by turbulent waves at boundar- 
ies between deep ocean layers of differ- 
ent densities. Now, researchers believe it 
is concentrated at seamounts and ridges 
(Science, 25 March 2022, p. 1324). “There’s 
a zoo of interesting things that happen 
when you have topography,” says Brian 
Arbic, a physical oceanographer at the Uni- 
versity of Michigan, Ann Arbor. 

When ocean currents curl around sea- 
mounts, they create turbulent “wake vor- 
tices” that can provide the energy to push 
cold water up, says Jonathan Gula, a physi- 
cal oceanographer at the University of 
Western Brittany. In unpublished research, 
Gula and co-authors have found that these 
wake vortices make seamounts the lead- 
ing contributor to upward ocean mixing, 
and a central player in climate. Since the 
team relied on the old Scripps catalog, not 
the new one, the effect of the seamounts is 
probably even larger, Gula adds. 

The seamount catalog is sure to expand 
further with Seabed 2030, an international 
project to accelerate high-resolution sonar 
mapping that Mayer is helping lead. But 
space surveys will improve too. NASA’s Sur- 
face Water and Ocean Topography satellite, 
launched in December 2022, can measure 
the height of a water surface to within a 
couple of centimeters. Better remote sens- 
ing would be welcome, given the cost of so- 
nar mapping voyages, Mayer says. “I would 
love to see it threaten what I do.”


A bumpy ocean bottom


Satellites have detected more than 43,000 seamounts. But only 16,000 have been charted in detail by sonar from ships and submarines.


[Map legend: uncharted seamounts; charted seamounts]


SCIENCE science.org 


SCIENTIFIC MISCONDUCT 


Acclaimed physicist accused of copying thesis


After superconductivity claims, Ranga Dias dogged by plagiarism allegations


By Daniel Garisto 


In March, University of Rochester (U of
R) physicist Ranga Dias made a block- 
buster announcement: His team had 
detected superconductivity at room tem- 
perature, in a material that did not need 
to be squeezed to incredibly high pres- 
sures. Many physicists regarded the claim 
warily because 6 months earlier, Nature 
had retracted a separate room-temperature 
superconductivity claim from Dias’s group, 
amid allegations of data manipulation. 
Now come accusations that Dias plagia- 
rized much of his Ph.D. thesis, completed in 
2013 at Washington State University (WSU). 
Undark, The New York Times, and Phys- 
ics magazine previously reported that his 
thesis contains many passages identical to 
those from a 2007 thesis written by James 
Hamlin at Washington University in St. 
Louis. Hamlin, now a high-pressure experi- 
mentalist at the University of Florida, and 
Simon Kimber, a physicist most recently at 
the University Burgundy Franche-Comté, 
have gone through the thesis by hand and 
say they have discovered more widespread 
examples of copying. In an analysis shared 
with Science, they find Dias’s thesis con- 
tains at least 6300 words—some 21% of 
the thesis—that are identical to passages 
from 17 other sources. Dias’s website at 
U of R also contains text that appears to 
have been copied without attribution from 
other sources, Hamlin and Kimber say. 
Experts who examined the analysis 
agree the thesis is heavily plagiarized. 
“It’s obvious,” says Lisa Rasmussen, a re- 
search ethicist at the University of North 
Carolina, Charlotte. A U of R spokesper- 
son noted that the plagiarism concerns 
are largely confined to the methodology 
and background section. But that doesn’t 
absolve Dias, says Vanja Pupovac, an expert 
in research integrity at the University of Ri-
jeka, in Croatia. “[It] demonstrates, at best,
the [Ph.D.] candidate’s gross negligence 
and, at worst, their lack of understanding of 
the topic,” she said in an email. 

Through a spokesperson, Dias declined 
to answer questions and stated he is “ad- 
dressing these issues directly with his the- 
sis adviser.” The U of R spokesperson also 
declined to comment further until an aca- 
demic integrity process at WSU was com- 
plete. WSU did not respond to a request 
for comment. 

Kimber became suspicious about Dias’s
work when Hamlin pointed out a detail in
a 2021 Physical Review Letters (PRL) paper
on which Kimber and Dias were co-authors:
A plot about manganese sulfide was nearly
identical to a plot about a different mate-
rial, germanium selenide, in Dias’s thesis.
On 20 March, PRL attached an expres-
sion of concern to the paper, which is now
under investigation because of questions
about the “integrity of the data.”

Kimber had heard about the similari-
ties between Hamlin’s and Dias’s theses
and wanted to see for himself. He was soon
struck by a line about a complex topic that
seemed tangential to Dias’s thesis:


In systems of itinerant fermions the rele-
vant excitations were identified as ‘qua-
siparticles’ which are in a one-to-one
correspondence with the single-parti-
cle momentum eigenstates of a non-
interacting system, but which have a
modified mass and a finite lifetime due
to the quasiparticle interaction.


“It just stands out so incredibly,” Kimber
says. Plugging it into Google Scholar, he
found it was identical to a sentence from
a 1998 paper.

Wondering how much more there was to
find, Kimber joined up with Hamlin to take
on the arduous task of searching through
the rest of the thesis. They checked sen-
tences by hand for similarities to other pa-
pers in Google Scholar. The manual effort
was required, Kimber says, because auto-
mated plagiarism detectors are sometimes
inaccurate.


Accusations of scientific misconduct are mounting for University of Rochester physicist Ranga Dias.


228 21 APRIL 2023 + VOL 380 ISSUE 6642

One of the 17 sources Hamlin and 
Kimber identified is a 1999 paper by Dias’s
thesis adviser, WSU materials scientist 
Choong-Shik Yoo. Yoo says he had spotted 
the apparent duplications himself while 
reviewing the thesis. “I thought that that
was just a simple mistake, so I didn’t think
of that as a big deal,” says Yoo, who adds 
that his paper is referenced elsewhere in 
the thesis. 

The university’s academic honesty 
policy, however, “makes no distinction 
between intentional and unintentional 
plagiarism.” According to Yoo, Dias sub- 
mitted a request to correct his thesis on 
30 March and it is now under review with 
the university’s Academic Integrity Hear- 
ing Board. Although WSU does not spec- 
ify the sanctions the board could impose, 
Rasmussen says plagiarism cases this seri- 
ous could include revocation of the thesis. 

Yoo did not review the files Dias submit- 
ted, but says, “I’m pretty sure his corrected 
thesis has implemented the correct infor- 
mation.” But Pupovac says this kind of re- 
vision creates serious problems. “Allowing 
correction of the thesis sends the message 
that dishonesty can be rectified without 
significant consequences, which under- 
mines the trust of scientists in the integrity 
of the academic system, and also the trust 
of the public in science,” Pupovac says. 

Asked how the plagiarism allegations, 
on top of other charges of data manipu- 
lation, affect her view of Dias’s super- 
conductivity claims, Rasmussen says, “It 
helps to paint a picture of someone as not 
really caring about standards. It suggests 
that they don’t mind cutting corners; that 
they may not have as many original ideas 
as they’re presenting themselves to have.” 

The apparent plagiarism did not stop 
with the 2013 thesis. Hamlin and Kimber 
also found that descriptions of research 
on Dias’s websites, for both U of R and 
Harvard University, where he completed a 
postdoc, contain several passages identical 
to at least three other sources. One page 
on the site, about research in 2D materi- 
als, has sentences that match a description 
on the website of University of Washington 
researcher Matthew Yankowitz. “I am es- 
sentially certain that the text ... was pla- 
giarized from my own,” Yankowitz said in 
an email. 

In the meantime, several research 
groups have failed to replicate Dias’s lat- 
est superconductivity claim. Faith in the 
result may be diminishing, but interest in 
room-temperature superconductivity re- 
mains hot. As Dias’s Harvard website puts 
it, “[E]fforts to identify and develop new
superconducting materials continue to 
increase rapidly, motivated by both fun- 
damental science and the prospects for 
applications.” 

A 2010 Nature paper by a group in Japan 
started with almost identical words. 


Daniel Garisto is a science journalist in New York City. 


science.org SCIENCE 


PHOTO: SCOTT PETERSON/GETTY IMAGES 


ECOLOGY 


Scientists plan a comeback for 
Ukraine’s war-ravaged forests 


Destruction could open the door to management reforms 


By April Reese 


In addition to its horrific human toll,
the war in Ukraine has inflicted widespread
damage on the nation’s forests.
Bombs and missiles have sparked thousands
of fires, and “artillery breaks
trees in half—it basically mows the
forest,” says Brian Milakovsky, a U.S.-born
forest ecologist who lived in eastern Ukraine
before fleeing the country.

Ironically, some forestry experts say the 
destruction could lead to a major overhaul 
of how Ukraine manages its forests, changes 
they say will help ensure these landscapes 
can better cope with climate change, sup- 
port biodiversity, and protect water quality. 
Optimistic that Ukraine will prevail in the 
war, the researchers are already planning 
for this greener postwar future. Milakovsky 
and Sergiy Zibtsev, a forest scientist at the 
National University of Life and Environ- 
mental Sciences of Ukraine, shared their vi- 
sion during a webinar held last week by the 
Yale School of the Environment. 

“We need to look at solutions that 
lead to different forest landscapes,” says 
Milakovsky, who continues to work on 
Ukraine forest issues from his new home in 
Latvia. “Because the status quo just is really 
struggling under climate change and war.” 

Even before the current war, Ukraine’s 
forests were considered some of the world’s 
most damaged. The expansion of agricul- 
ture in this major food exporter had vastly 
reduced forest cover; nearly half of Ukraine 
is now cropland. In many of the forests that 
remained, open stands of fire-adapted Scots
pine had been replaced by crowded, more 
fire-prone plantations. The dense stands 
were encouraged by Soviet-era policies that 
aimed to “pack as much wood as you can on 
every hectare,” Milakovsky says. But if “fire
gets in, it just is death.” 

Plantations are common in eastern 
Ukraine, where much of the fiercest fight- 
ing is now taking place. Since the Rus- 
sian invasion began in February 2022, 
almost 20,000 fires have burned across 
755,638 hectares, according to remote sens- 
ing data. Farmers accidentally started some 
fires when they burned fields to clear them, 
but weaponry ignited many others. Forests 
also have been damaged by the construc- 
tion of trenches, bunkers, and roads. Some 
of the worst damage is along rivers, such as 
the Siverskyi Donets, that have become cru- 
cial lines of defense, Milakovsky says. 

Ukrainian law encourages foresters to 
replant plantations whenever an area is 
logged or burned. To try a different man- 
agement regime, they have to get special 
permission, and few seek it. “Economics, 
legislation, and habit” enable plantations 
to persist, Milakovsky said, despite increas- 
ing concerns that the monocultures do rela- 
tively little to support native species and 
can suck up scarce water. 

Ukrainian tanks take cover in a pine plantation near Kreminna in February.

Researchers say the war damage presents
an opportunity for a long overdue policy
shift. The blazes and military activities are
breaking up some plantations, for example,
opening the door to creating more diverse
mosaics of forest types, managed for a mix
of restoration and logging, Milakovsky said.
That will require political will but could 
result in more resilient woodlands. Forests 
with mixed species and well-spaced trees 
of varying ages would be less susceptible 
to intense fires and the droughts that are 
expected to become more common as the 
region’s climate warms, Milakovsky says. 
“Ukraine is pretty dry, and it’s getting drier.” 

This year, a rainy winter has encouraged 
growth, priming forests for big burns, says 
Petro Testov, an ecologist with the Ukrai- 
nian Nature Conservation Group. “If next 
year will be dry, we could see very huge for- 
est fires like in 2020,” he says, when blazes 
tore through pine forests around Luhansk, 
killing 17 people. 

The researchers also hope to build 
groundwater protection into forest man- 
agement. Sandy landscapes like those in 
southeastern Ukraine allow water to soak 
into the ground and replenish aquifers, but 
plantations can interfere. When fires thin 
them, wetlands often reappear and ground- 
water levels rebound. Shrinking plantations 
could avoid the “repeated depletion” of 
these water resources, Milakovsky said. 

Zibtsev is planning to soon convene 
Ukrainian forest scientists—many of whom 
have relocated to other nations—to discuss 
how to advance these and other reforms 
and improve collaboration. Once the con- 
flict ends, he and Milakovsky also hope 
to resume work the two began before the 
war with local foresters around the city of 
Kreminna—now one of the hottest combat 
zones. The local partners agreed to test al- 
ternative management methods, such as 
allowing low-value areas to naturally re- 
generate. “They liked those [sustainable 
forestry] ideas,” Zibtsev says, but worried
about running afoul of government regula- 
tions. “All of them were just like: ‘Help us 
get some kind of [legal] protection.’”

Even as Ukraine’s forest scientists look 
to the future, the war’s continuing toll on 
the forests they have worked in for years 
is never far from their minds. Copernicus, 
the European Union’s satellite monitoring 
system, shows constellations of fires across 
eastern Ukraine, particularly between the 
cities of Kharkiv and Luhansk. 

Still, Zibtsev has little doubt that his 
vision of a conflict-free Ukraine, rich in 
sustainably managed forests, will be real- 
ized. “We expect within the next 20 years, 
there will be quite radical changes,” he says. 
“We're in a position to push this agenda. 
And we already have some progress.”


April Reese is a journalist in Aveiro, Portugal. 
AI-driven robotics lab joins the
hunt for materials breakthroughs 


Setup is the first fully automated effort seeking novel 
inorganic materials for emerging technologies 


By Robert F. Service, in San Francisco 


Imagine a cookbook with 150,000 tempting
dishes—but few recipes for making
them. That’s the challenge facing an ef- 
fort at the Lawrence Berkeley National 
Laboratory (LBNL) known as the Ma- 
terials Project. It has used computers 
to predict some 150,000 new materials 
that could improve devices such as battery 
electrodes and catalysts. But the database’s 
users around the globe have managed to 
make just a fraction of these for testing, 
leaving thousands untried. “Synthesis has 
become the bottleneck,” says Gerbrand 
Ceder, a materials scientist at LBNL. 

Now, Ceder and his colleagues have married
artificial intelligence (AI) and robotics
to eliminate that bottleneck. The AI system 
makes a best guess at a recipe for a desired 
material and then iterates the reaction con- 
ditions as robots try to create physical sam- 
ples. The new setup, known as the A-Lab, is 
already synthesizing about 100 times more 
new materials per day than humans in the 
lab can manage. “This is the way to go,” says 
Ali Coskun, a chemist at the University of
Freiburg who isn’t involved with the A-Lab, 
but attended the Materials Research Society 
meeting here last week, where the new AI 
approach was announced. 

AI-driven robotics labs are becoming
commonplace among pharmaceutical com- 
panies searching for new drugs and even 
some academic materials labs (Science, 
13 December 2019, p. 1295). But those efforts
primarily use liquid precursor compounds 
that are relatively straightforward to mix 
and process. “It’s a lot more difficult to do 
this with solid materials,” Coskun says. Syn- 
thesizing these materials typically requires 
mixing solid powders together and then 
adding different combinations of solvents, 
and experimenting with heat, drying time, 
and other inputs to try to get them to crys- 
tallize into the predicted material. 

LBNL’s fully automated A-Lab can churn out new materials 24/7 without human intervention.

The number of recipes is essentially
infinite, Ceder says. Although computers
can predict which final compounds should
lead to better devices, “there is no theory
for synthesis that tells us what can and
cannot be made,” says Kristin Persson, who
heads LBNL’s Materials Project and
announced the new A-Lab.

Previous automation efforts randomly 
mixed compounds in search of new materials,
Ceder says, but the new AI-driven approach
is more akin to the way traditional
chemists do their jobs. The AI starts by 
coming up with a plausible way to synthe- 
size a material, using its understanding of 
chemistry. It guides robotic arms to select 
among nearly 200 different powdery start- 
ing materials, containing elements such as 
lithium, nickel, copper, iron, and manga- 
nese. After mixing the precursors, another 
robot parcels out the mix into a set of 
crucibles, which are loaded into furnaces 
where they can be mixed with gases such 
as nitrogen, oxygen, and hydrogen. The 
AI then determines how long to bake the 
different mixes, the temperatures, drying 
times, and so on. 

After the baking, a gumball-like dis- 
penser adds a ball bearing to each crucible 
and shakes it to grind the new substance 
into a fine powder that’s loaded onto a 
slide. A robot arm then grabs each sample 
and slides it into an x-ray machine or other 
equipment for analysis. Results are fed 
back into the Materials Project database of 
materials structures and properties, and if 
the outcome isn’t what was predicted, the 
AI setup iterates the reaction conditions 
and starts anew. 
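
The propose, synthesize, analyze, and retry cycle described above can be sketched as a toy closed loop in Python. This is an illustration only, not the A-Lab's software: the `Recipe` fields, the `simulate_synthesis` stand-in for the robots and x-ray analysis, its made-up optimum of 900 °C and 6 hours, and the simple random-perturbation rule (a basic hill climb, used here in place of whatever the real AI does) are all assumptions.

```python
import random
from dataclasses import dataclass


@dataclass
class Recipe:
    temperature_c: float  # furnace temperature (hypothetical parameter)
    bake_hours: float     # time in the furnace (hypothetical parameter)


def simulate_synthesis(recipe: Recipe) -> float:
    """Stand-in for robots plus x-ray analysis: returns a made-up match
    score in [0, 1] that peaks at 900 C and 6 hours."""
    t_err = abs(recipe.temperature_c - 900.0) / 900.0
    h_err = abs(recipe.bake_hours - 6.0) / 6.0
    return max(0.0, 1.0 - t_err - h_err)


def closed_loop(initial: Recipe, target_score: float = 0.95,
                max_rounds: int = 200, seed: int = 0):
    """Try a recipe, check the result, and perturb the conditions until the
    product matches the prediction or the experiment budget runs out."""
    rng = random.Random(seed)
    best, best_score = initial, simulate_synthesis(initial)
    for round_no in range(1, max_rounds + 1):
        if best_score >= target_score:
            return best, best_score, round_no - 1
        candidate = Recipe(
            temperature_c=best.temperature_c + rng.uniform(-50, 50),
            bake_hours=max(0.5, best.bake_hours + rng.uniform(-1, 1)),
        )
        score = simulate_synthesis(candidate)
        if score > best_score:  # keep only conditions that improve the match
            best, best_score = candidate, score
    return best, best_score, max_rounds


start = Recipe(temperature_c=700.0, bake_hours=3.0)
recipe, score, rounds = closed_loop(start)
print(f"{rounds} rounds -> {recipe}, match score {score:.2f}")
```

The point of the sketch is the feedback step: each analysis result decides the next set of reaction conditions, so no human has to choose what to try next.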

LBNL researchers have spent the past 
several months working out the kinks in 
their system and testing it. In the pro- 
cess, the A-Lab has produced more than 
40 target materials—about 70% of the com- 
pounds it has set out to produce. “I have 
made more new compounds in the last 
6 weeks than my whole career,” Ceder says.

LBNL’s AI materials lab may not be 
alone for long. In a 3 April preprint, re- 
searchers from the Samsung Advanced 
Institute of Technology reported that they, 
too, have set up a computer-driven robot- 
ics lab to search for new electronic mate- 
rials. Results from that report show their 
setup performed more than 200 reactions 
to make 35 inorganic compounds, includ- 
ing certain oxides commonly used in bat- 
tery electrodes, solid oxide fuel cells, and 
superconductors. In each stage of their ro- 
botic experiments “AI is used to some de- 
gree,” says Samsung’s Jeong-Ju Cho. 

Ceder notes that despite the move to 
fully automated synthesis and analysis, re- 
searchers are just as likely as ever to make 
unexpected discoveries. “That’s no differ- 
ent with the A-Lab.” Except now, the hits 
and the surprises will likely come faster.
Yeast are engineered to thrive on light


Experiment shows ease by which organisms could harness sunlight to produce energy 


By Elizabeth Pennisi 


Yeast are carb lovers, sustaining
themselves by fermenting sugars and
starches from sources such as dough,
grapes, and grains, with bread, wine,
and beer as happy byproducts. Now,
researchers have made one type of
yeast a little less dependent on carbs by en- 
abling it to use light as energy. 

The work, reported last week on the pre- 
print server bioRxiv, is “the first step in 
more complex modes of engineering artifi- 
cial photosynthesis,” says Magdalena Rose 
Osburn, a geobiologist at North- 
western University who was not 
involved in the research. It also 
recapitulates a key evolutionary 
transition—the harnessing of 
light. “It is extraordinary,” says
Felipe Santiago-Tirado, a fungal 
cell biologist at the University 
of Notre Dame. “To some ex- 
tent, it’s like turning an animal 
into a plant.” 

Well, not quite. To convert 
carbon dioxide into sugars that 
fuel life on Earth, plants rely 
on a protein complex that in- 
cludes chlorophyll to shuttle 
both electrons and protons,
which perform chemical re- 
actions and transfer energy. 
Researchers have been work- 
ing for years to recreate pho- 
tosynthesis to explore how to 
use light more efficiently as an 
energy source for solar pan- 
els and other applications and 
to breed plants—and other organisms— 
to be more productive. 

But the chlorophyll complex requires 
many other molecules to do its job. So 
Anthony Burnetti, a geneticist at the Geor- 
gia Institute of Technology, and Georgia 
Tech evolutionary biologist William Ratcliff 
sought a simpler solution. They homed in 
on a protein known as rhodopsin, which 
doesn’t require a large molecular entou- 
rage. It’s a solution nature has settled on as 
well: Bacteria, some protists, marine algae, 
and even algal viruses use rhodopsin to con- 
vert light into usable energy, often to pump 
protons for cellular functions. 

The researchers began by inserting a
rhodopsin gene that belonged to a marine
bacterium into brewer’s yeast (Saccharomyces
cerevisiae) in a petri dish. Burnetti hoped 
the rhodopsin would find its way into the 
yeast’s vacuole, an enzyme-laden sac that 
degrades unneeded proteins. An energy 
molecule called adenosine triphosphate 
(ATP) fuels the process by pumping protons 
into the vacuole to make its interior acidic— 
optimal for degradation. 

The blue cell walls of individual yeast cells surround rhodopsin (green),
which helps those cells grow faster.

Burnetti wondered whether light energy
could do that job instead. But the team’s
first effort misfired when the rhodopsin
protein made by the gene went to a different
compartment known not for protein
degradation, but for protein synthesis. So
Burnetti looked instead for rhodopsin al- 
ready known to exist in vacuoles. He set- 
tled on using one from corn smut, a fungal 
pathogen. By attaching a green fluorescent 
tag to the protein, he and his colleagues 
verified that it had localized to the yeast’s 
vacuole, as they hoped. 

Graduate student Autumn Peterson, a 
member of Burnetti’s team, went a step 
further to prove this engineered yeast was 
indeed using light. She grew the new strain 
in the same dish as the original, unaltered 
yeast and exposed it to green light, the 
wavelength rhodopsin is most sensitive to. 
The cells in the light-sensing strain had 
shorter lives but reproduced fast enough to
outgrow the non-light-sensing yeast by 0.8%,
the team found. That’s a “massive advan- 
tage,” says Santiago-Tirado. Over time, in 
the light, Peterson expects the light-using 
cells to eventually replace the unaltered 
ones just as early light users might have re- 
placed their competitors in nature eons ago. 
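
To see why a 0.8% edge can count as large, here is a rough Python sketch of how such an advantage compounds. It rests on an assumption the article does not state: that 0.8% is a per-generation growth advantage, so the ratio of light-using to unaltered cells is multiplied by 1.008 each generation. The function name and the 50% starting share are likewise made up for illustration.

```python
import math


def generations_to_fraction(start_fraction, target_fraction, advantage):
    """Generations for a strain to rise from start_fraction to
    target_fraction of the culture, assuming its ratio to the competitor
    grows by a factor of (1 + advantage) every generation."""
    start_ratio = start_fraction / (1 - start_fraction)
    target_ratio = target_fraction / (1 - target_fraction)
    return math.ceil(math.log(target_ratio / start_ratio)
                     / math.log(1 + advantage))


# Hypothetical scenario: equal mix at the start, 99% light users at the end.
gens = generations_to_fraction(0.5, 0.99, 0.008)
print(f"about {gens} generations to reach 99% of the culture")
```

Under these assumptions the takeover needs several hundred generations, which is brief on evolutionary timescales.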

Burnetti and his colleagues think light in- 
duces the rhodopsin to pump more protons 
into the vacuole, relieving the cells’ need to 
expend ATP for this task and instead free- 
ing up that energy to help the cell grow in 
other ways. Increasing the acidity inside the 
vacuole may decrease it outside the vacuole, 
causing enzymes there to work 
faster and wear out sooner, which 
may also help explain the higher 
death rate among these altered 
cells. Whichever way it’s working, 
“It is clearly of benefit to the yeast
cells,” says Michael McMurray,
a molecular biologist at the Uni- 
versity of Colorado Anschutz 
Medical Campus. 

But the experiment may not 
reveal much about how rho- 
dopsin use evolved in nature. “I 
think the authors overemphasize 
the evolutionary significance
of their work,” says Robert
Blankenship, an emeritus bio- 
chemist at Washington University 
in St. Louis. “This is an artificial 
construct and is not the product 
of natural evolution.” 

Others think the work can have 
industrial, medical, and basic
research applications. Alaattin 
Kaya, a biologist who studies ag- 
ing at Virginia Commonwealth University, 
says these yeast cells can help clarify why 
vacuole acidification over the life of a cell 
sometimes seems to cause mitochondria to 
malfunction and in turn accelerate aging. 
He would love to add rhodopsin to mito- 
chondria themselves to observe its impact. 

Burnetti would like to target mitochon- 
dria as well, but for a different reason. “Even 
though it seems to have never happened 
in nature, we definitely plan to eventually 
put rhodopsin into the mitochondrion.” 
Because mitochondria can make ATP effi- 
ciently, adding rhodopsin could provide a 
lot of energy directly from the Sun, just as 
photosynthesis does. In that regard, yeast 
would then be a little more like plants. 
Tailored melanoma vaccine may stave off cancer


Small trial offers first clinical evidence that supports a personalized vaccine approach 


By Jocelyn Kaiser 


A novel cancer vaccine tailored to
genetic changes in a person’s tumor is
showing promise in the clinic. In a
study of about 150 people who had
surgery for melanoma, a type of skin
cancer, those given a personalized
vaccine along with an immunotherapy drug 
were more likely to remain free of cancer 
18 months later than patients who did not 
receive the vaccine. 

The results, reported this week at the an- 
nual meeting of the American Association 
for Cancer Research (AACR), offer 
the first clear evidence that a vac- 
cine designed to target mutations 
within a patient’s tumor can prevent 
its regrowth. That would be a mile- 
stone for the cancer vaccine field, 
which has struggled for decades to 
show results. It could also add to a 
growing arsenal of drugs, known as 
immunotherapies, that harness the 
immune system to fight cancer. “I 
was really, really excited to see these 
data,” says Patrick Ott of the Dana- 
Farber Cancer Institute, who works 
on similar vaccines. Although small, 
the new study is “a very exciting first 
step,” says cancer vaccine researcher 
Nina Bhardwaj of the Icahn School of 
Medicine at Mount Sinai. 

Cancer vaccines aim to teach the 
immune system’s T cells to attack a 
tumor by exposing them to a pro- 
tein, or antigen, that pokes out from 
a cancer cell. But most vaccines so 
far haven’t worked well because the same 
antigens found on tumors also appear on 
normal cells. 

In the early 2010s, as DNA sequencing 
costs dropped, some scientists turned instead 
to sequencing the mutations in a patient’s 
tumor, then creating a vaccine to deliver a 
few of the corresponding mutated proteins, 
known as neoantigens, which are found only 
on the tumor cells. Several small trials pub- 
lished since 2015 by Ott’s team and others 
have shown that neoantigen vaccines can 
stimulate vaccine-specific T cells in patients 
with solid tumors such as melanoma, colon, 
lung, and brain cancer, and at least in mela- 
noma, may curb cancer growth. 

To show this more definitively, Merck and 
Moderna conducted a randomized trial for 
patients who had advanced melanoma that
had spread to lymph nodes and sometimes 
other sites, but that had been surgically re- 
moved. All got a type of drug, known as a 
checkpoint inhibitor, that blocks a crucial 
protein from enabling tumors to evade T 
cells. Two-thirds also got vaccine infusions 
every 3 weeks for about 4 months. Like Mod- 
erna’s COVID-19 vaccine, the cancer vaccine 
delivered messenger RNA (mRNA) wrapped 
in lipid nanoparticles into cells, instructing 
them to make proteins—in this case, up to 
34 tumor neoantigens per patient. 

Tailored cancer vaccines help train T cells (white) to attack tumors.

In December 2022, the companies made
a splash when they reported that patients
receiving the vaccine were 44% less likely to 
die or have a recurrence of their cancer. At 
the AACR meeting, academic collaborators 
shared more details: Eighty-four of the 107 vaccinated patients, or
79%, were still in remission after 18 months, 
compared with only 31 of 50 (62%) patients 
who got the checkpoint inhibitor alone. 
“These data give a very, very encouraging 
signal,” says Jeffrey Weber of NYU Langone’s 
Perlmutter Cancer Center, the trial’s principal 
investigator. 
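
The percentages follow directly from the reported counts, as this short Python check shows. The last step is an assumption added for illustration: it treats every patient not in remission at 18 months as an "event" and computes a crude relative reduction. The companies' 44% figure presumably comes from a time-to-event analysis, so the crude number should land near it but need not match it.

```python
# Counts reported at the AACR meeting, as given in the article.
vaccine_free, vaccine_total = 84, 107   # vaccine plus checkpoint inhibitor
control_free, control_total = 31, 50    # checkpoint inhibitor alone

vaccine_rate = vaccine_free / vaccine_total
control_rate = control_free / control_total

# Assumption: anyone not in remission at 18 months counts as an event.
vaccine_event = 1 - vaccine_rate
control_event = 1 - control_rate
crude_reduction = 1 - vaccine_event / control_event

print(f"In remission at 18 months: {vaccine_rate:.0%} vs. {control_rate:.0%}")
print(f"Crude relative reduction in events: {crude_reduction:.0%}")
```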

Also encouraging is that the vaccine 
worked regardless of how many mutations 
the person’s melanoma tumor had, sug- 
gesting it could work for cancer types with 
fewer mutations. With less to distinguish 
them from normal cells, such cancers tend
to resist immunotherapy drugs. A larger
study starting later this year aims to confirm 
these results and reveal whether the vac- 
cine extends patients’ lives, measures that 
could encourage regulators to approve it. For 
now, “these [are] intriguing early findings,” 
says immunotherapy researcher Suzanne 
Topalian of Johns Hopkins University. Like 
other researchers, she hopes to see more de- 
tails, including evidence that patients who 
did well made T cells specific to the neo- 
antigens and didn’t just get an immune boost 
from the vaccine’s nanoparticles. Weber 
says those data will be reported in papers the 
team is submitting to journals. 

Other companies are also testing 
neoantigen vaccines in randomized 
trials. BioNTech and Genentech ex- 
pect to report early results this year 
for a neoantigen mRNA vaccine for 
metastatic melanoma that can’t be 
surgically removed—a tougher chal- 
lenge partly because the patients 
have weakened immune systems. And 
Gritstone bio is testing a neoantigen 
mRNA vaccine against metastatic 
colon cancer; to boost the immune 
response, it is combined with a modi- 
fied virus carrying the neoantigens. 
The Gritstone team reported in 
Nature Medicine in August 2022 that 
in several cancer patients, this re- 
sulted in “very significant numbers 
of T cells,” according to Bhardwaj, a
promising sign of efficacy. 

One of the most intriguing studies 
so far tested a BioNTech and Genen- 
tech neoantigen vaccine for pancreatic 
cancer. Investigators reported last summer 
that eight of 16 patients in a trial had T 
cell responses to the vaccine and were still 
cancer-free up to 2.5 years later. The other 
eight did not show an immune response and 
six had relapsed by 18 months. The compa- 
nies plan to launch a randomized trial of that 
vaccine for pancreatic cancer this year. 

Because pancreatic cancers have few mu- 
tations, “you might think this is the last tu- 
mor type” that a neoantigen vaccine would 
work for, says principal investigator Vinod 
Balachandran of Memorial Sloan Kettering 
Cancer Center, who presented full details at 
AACR and whose team has a paper in press. 
“If you can even do this in pancreatic cancer,
this is very encouraging for testing personal- 
ized vaccines” for other cancers. 
U.S. labs are overhauling the nuclear stockpile.
Can they validate the weapons without bomb tests?

By Sarah Scoles,
in Los Alamos, New Mexico

Behind a guard shack and warning
signs on the sprawling campus of
Los Alamos National Laboratory
is a forested spot where scientists
mimic the first moments of
a nuclear detonation. Here, in the
Dual-Axis Radiographic Hydrodynamic
Test (DARHT) facility,
they blow up models of the bowling
ball-size spheres of plutonium, or “pits,”
at the heart of bombs—and take x-ray
pictures of the results.

A Cold War stalwart, the Titan II missile (seen in an Arizona museum) carried warheads that had been tested. Weapons must now be certified without tests.

In a real weapon, conventional explosives
ringing an actual pit would implode the
plutonium to a critical density, triggering an
explosive fissile chain reaction. Its energy
would drive the fusion of hydrogen isotopes 
in the weapon’s second stage, generating yet 
more neutrons that would split additional 
fission fuel. 

This fission-fusion-fission process eats up 
some of the atoms’ mass and, according to 
E = mc², Albert Einstein’s famous equation,
releases ferocious amounts of energy. That’s 
why a warhead about 1 meter long can
explode with the force of a megaton of TNT. If
dropped on a city like Washington, D.C., it 
would instantly vaporize an area more than 
2.5 kilometers across while crumpling build- 
ings much farther out with its radioactive 
blast. It would kill nearly half a million 
people and injure or sicken almost as many. 
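
As a rough worked example of that equation, the Python sketch below estimates how much mass must be converted to energy to yield 1 megaton. It uses two standard constants that do not appear in the article: the defined speed of light and the conventional 4.184 × 10^15 joules per megaton of TNT. The function name is made up for illustration.

```python
SPEED_OF_LIGHT = 299_792_458.0   # meters per second (exact by definition)
JOULES_PER_MEGATON = 4.184e15    # conventional energy of 1 megaton of TNT


def mass_converted_kg(yield_megatons: float) -> float:
    """Mass (kg) converted to energy for a given yield, from m = E / c^2."""
    energy_joules = yield_megatons * JOULES_PER_MEGATON
    return energy_joules / SPEED_OF_LIGHT ** 2


grams = mass_converted_kg(1.0) * 1000
print(f"A 1-megaton yield corresponds to about {grams:.0f} grams of mass")
```

The result is a few tens of grams, a tiny fraction of the weapon's fuel.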

DARHT’s experiments take place within 
a steel vessel shaped like a diving bell. The 
mock pits, made of dense metals such as 
lead, tantalum, or depleted uranium, have 
properties similar to plutonium—minus 
its tendency to fission. As the explosive 
charges are detonated, two perpendicular 
beams of x-rays document the pit’s implo- 
sion like high-speed cameras. Weapons 
scientists compare those pictures with clas- 
sified supercomputer simulations of the 
bomb blasts to see how well the real 
and digital worlds match. 

Facilities like DARHT have been 
important since 1992, when the 
Department of Energy’s (DOE's) 
three weapons labs—Los Alamos, 
Lawrence Livermore National 
Laboratory, and Sandia National 
Laboratory—stopped full-fledged 
tests of nuclear weapons. By 1996, 
the United States had signed the 
Comprehensive Nuclear-Test-Ban 
Treaty—credited not only with stop- 
ping the environmental damage of 
nuclear testing, but also with disin- 
centivizing new weapons designs. 

Without tests, however, the only 
things ensuring that warheads work 
are facilities like DARHT, com- 
puter simulations from “weapons 
codes,” and a cache of data from 
the old days of nuclear testing. For 
relatively minor changes to old 
weapons—new fuses, fresh top-ups of the 
hydrogen isotope tritium—that has been 
enough. Every year, DOE’s National Nuclear 
Security Administration (NNSA) and the 
Department of Defense have certified the 
stockpile, an assessment that means they 
are convinced the weapons will work when 
they’re supposed to, as they’re supposed to— 
and not do anything when they’re not sup- 
posed to. “Because we’ve blown up so many 
of them, these things are incredibly reliable,” 
says Geoff Wilson, director of the Center for 
Defense Information at the Project on Gov- 
ernment Oversight, which argues nuclear 
weapons spending should be reduced. 

But now the stockpile is getting an over- 
haul, the biggest in decades. This fiscal 
year, NNSA has a record $22.2 billion bud- 
get. Much of the money will go to produc- 
ing new plutonium pits to replace those in 
the arsenal and to modernizing four warheads.
A fifth weapon, dubbed the W93—a
submarine-launched warhead—is a new design
program. “It’s really the first warhead
program we've had since the end of the Cold 
War” that isn’t a life extension or modern- 
ization of an existing weapon, says Marvin 
Adams, NNSA’s deputy administrator for 
defense programs. 

The work has become more urgent, with 
the post-Cold War calm turning stormy 
again. Russia has backed out of its only 
remaining major arms-control treaty with 
the United States, while making regular nu- 
clear threats during its invasion of Ukraine. 
China is thought to be expanding its stock- 
pile, while Iran and North Korea continue 
to bolster nuclear programs. “Everybody 
went to sleep for 25 years,” says Charlie 
Nakhleh, Los Alamos’s head of weapons 
physics. “I think we’re awake now.” 


At one Los Alamos facility, x-ray beams are used to image imploding 
mock “pits,” the spheres of plutonium at the heart of nuclear weapons. 


Wilson worries that the international dy- 
namics and the U.S. overhaul could ultimately 
lead to a revival of bomb tests, bringing back 
their hazards and stoking a new arms race. 
“It is not unfathomable to me, which is scary
to say.” It’s one thing to tweak weapons with
a deep heritage. It’s another to infer function- 
ality for modified weapons that have never 
been fully tested, he says. 

Weapons physicists at the labs are confi- 
dent they can improve existing weapons and 
design new ones without tests. Their com- 
puter simulations are vastly superior to those 
of the past, and experiments like DARHT’s
are more powerful. “Would you design a new 
Formula One car without taking it on the 
track? Or would you design a new Boeing jet- 
liner without flying at first?” asks Rob Neely, 
Livermore’s program director for weapons 
simulation and computing. In the case of nu- 
clear weapons and their plutonium pits, he 
says, the answer appears to be, “Actually, yes.” 


As the simulations and experiments have 
improved, they’ve also revealed gaps in nu- 
clear knowledge, and approximations in the 
codes that haven’t been updated in decades. 
Despite the doubts, Neely brims with confi- 
dence. “Not only will these things work, but 
they’re going to work better.” 


SIMPLY REPLACING the bombs’ plutonium 
pits poses a science challenge: understand- 
ing how subtle changes affect their behavior. 
They aren’t easy to make, in part because 
plutonium, a metal only in existence since 
1940, is mysterious and hard to handle. 
The last time anyone made pits at scale—in 
the 1980s at Colorado’s Rocky Flats plant— 
DOE’s contractor was shut down for envi- 
ronmental violations and forced to pay an 
$18.5 million fine. 

This time, NNSA is splitting pro- 
duction between Los Alamos and the 
Savannah River Site in South Caro- 
lina. It has tasked them with making 
80 new pits per year by 2030, a dead- 
line NNSA admits it will not meet. 

Los Alamos’s pits will be made at 
a facility called PF-4, a set of high- 
security buildings surrounded by 
cyclone fences with razor wire. In- 
side PF-4 are glovebox enclosures— 
radiation-shielded workstations 
where workers use thick gloves and 
peer through glass windows to ma- 
nipulate the exotic metal. The lab is 
hiring thousands of workers, and its 
first pit is likely to be ready for the 
stockpile next year. 

The gargantuan effort is moti- 
vated by a simple fact: many current 
pits are more than 40 years old, and 
plutonium behaves in confounding 
ways as it ages and radioactively de- 
cays. A green, fuzzy coating grows on it as its 
surface oxidizes. Atoms in its metallic lattice 
are knocked out of place as it spits out ura- 
nium isotopes. Its dimensions shift when it 
slips between six different solid phases. And 
the pits do not necessarily degrade smoothly. 
“We know at some point there will be a nonlinear
piece,” says David Clark, director of Los
Alamos’s National Security Education Center 
and editor of the Plutonium Handbook. “We 
just haven’t seen it.” 

So far, the silvery spheres seem to be hold- 
ing up. Internal and external assessments 
have vouched for their integrity, suggesting 
the pits could have decades of viability left. 
“We haven’t seen any issues,” Clark says. 

But Jason, a secretive group of physicists 
who advise the government on national se- 
curity matters, raised concerns that galva- 
nized DOE. In a 2019 report, the group urged 
the agency to reestablish pit production 
“as expeditiously as possible” to “mitigate
against potential risks posed by Pu aging.”

One might think the new pits would make 
it easier to certify the stockpile, by avoiding 
the uncertainties of aging plutonium. But 
they come with uncertainties of their own. 
The new pits won’t be twins of their predecessors,
so weapons scientists will have to
understand how the alterations change pit 
behavior. They are being manufactured us- 
ing recycled and purified plutonium from old 
pits, not fresh material, unlike the originals. 
Moreover, they will be made with different 
processes, and in some cases designed to 
slightly different specifications. “If you look 
at a new requirement,” Adams says, “you often
will find that the old pits we have available
to us are really, really suboptimal.”


IN SOME WAYS, understanding the behavior 
of nuclear weapons has grown harder as 
scientists have gotten better at their jobs. 
The higher quality simulations enabled by 
ever more powerful supercomputers, for 
instance, have sometimes revealed new 
problems. This was the case with “boost 
physics,” or the processes at work in the 
first stage of a thermonuclear bomb, where 
fissioning plutonium triggers fusion reac- 
tions in a deuterium-tritium booster, which 
releases neutrons that spark more fission in 
the weapon's pit. 

For a long time, the simulations couldn’t 
reproduce what physicists saw in data 
from underground nuclear tests without 
the application of digital fudge factors. In 
2006, scientists increased the simulations’ 
resolution. “And, lo and behold, we found 
obviously some interesting things that got 
a bunch of people scratching their heads,” 
Neely says. That helped spawn years of 
research in a program called the National 
Boost Initiative, which aimed to understand 
the fundamental physics of thermonuclear 
burn and to incorporate more basic phys- 
ics into simulations, rather than relying on 
calibrations and approximations. 

Pesky approximations rear their fuzzy 
heads throughout the weapons codes, Neely 
says. One is inherent to the nature of the simulations.
They are all “meshed”—simulated in
gridlike parts, like pixels in a digital image. 
Within each mesh element, physical proper- 
ties are assumed to be the same. The mesh 
is getting more refined, but it’s still not a 
precise representation of reality. “You’re just 
able to capture better approximations,” Neely
says, “but still an approximation.” 
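
The meshing idea can be illustrated with a toy calculation. The sketch below is not taken from any weapons code; it assumes a smooth one-dimensional profile (the hypothetical function `profile`) and replaces it, cell by cell, with a single averaged value, as the paragraph above describes. The helper name `mesh_error` and the cell counts are illustrative choices only.

```python
import math

def mesh_error(f, a, b, n_cells, samples_per_cell=50):
    """Largest gap between f and its one-value-per-cell stand-in on [a, b]."""
    width = (b - a) / n_cells
    worst = 0.0
    for i in range(n_cells):
        left = a + i * width
        # Sample points spread evenly inside this mesh cell.
        xs = [left + (k + 0.5) * width / samples_per_cell
              for k in range(samples_per_cell)]
        values = [f(x) for x in xs]
        # One number stands in for the whole cell.
        cell_value = sum(values) / len(values)
        worst = max(worst, max(abs(v - cell_value) for v in values))
    return worst

def profile(x):
    """Hypothetical smooth physical profile used only for illustration."""
    return math.exp(-x * x)

# Refine the mesh and record how far the approximation strays from the profile.
errors = {n: mesh_error(profile, -3.0, 3.0, n) for n in (10, 40, 160)}
for n, err in errors.items():
    print(f"{n:4d} cells -> worst error {err:.4f}")
```

In this toy case the worst error shrinks as the cells get smaller but never reaches zero, which is the sense in which a finer mesh is a better approximation and still an approximation.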

There’s also fuzziness in the physics that 
governs the simulations. To make the simula- 
tions run more efficiently, scientists often rely 
on math tricks and approximations rather 
than explicit, first-principles solutions. 

Weapon of mass destruction
Thermonuclear bombs combine fission and
fusion to boost the explosive yield of a weapon.
Whereas the fission-only bomb that was
dropped over Hiroshima, Japan, exploded with
the force of 15 kilotons of TNT, hydrogen bombs
in the current arsenal can be about 100 times
more powerful.

Primary stage
A sphere of explosive charges drives the
implosion of a plutonium pit to a critical
density, triggering fission. The fusion of the
hydrogen isotopes deuterium and tritium
in the pit’s hollow core creates extra neutrons
that help boost the fission explosion.

Secondary stage
The x-ray energy released by the
primary compresses the secondary
and its uranium sparkplug, triggering
another fission reaction that heats
the lithium deuteride fuel and
ignites fusion. The fusion explosion
releases many neutrons that fission
the massive uranium tamper.

Diagram labels: ballistic missile; warhead; warhead jacket;
arming, fuzing, and firing unit; explosive charges; plutonium pit;
tritium and deuterium booster; uranium tamper; lithium deuteride;
uranium sparkplug.


Boom times
The three U.S. Department of Energy weapons laboratories are getting billions of dollars to upgrade four weapons.
A new design program, the W93, could end up fielding the first new weapon since 1988.

NAME          UPGRADE
W93           Will be put in service by 2040 and launched from submarines.
W88-Alt-370   An alteration to submarine-launched W88 weapons will replace fuze assemblies, add a
              lightning protector, and replace the conventional explosives.
W87-1         A replacement to the land-launched W78, the W87-1 will have enhanced safety features
              and use insensitive explosives.
W80-4         This weapon will extend the life of the air-launched W80-1. It was engineered with the
              Air Force, which designs the delivery systems.
B61-12        A replacement for all four variants of the air-dropped B61 will have maneuverable fins,
              enabling better targeting that will allow designers to reduce the yield.


Christopher Fryer, head of Los Alamos’s
Center for Nonlinear Studies, has found
that the weapons codes still contain computational
tricks conjured up decades ago by
Manhattan Project luminaries such as Hans
Bethe and Richard Feynman. “Instead of relying
on them to be the clever people, we’re
going to have to be clever again,” he says.

One of Bethe’s recipes, still stirred into 
some fusion simulations, involves the move- 
ment of charged particles. The recipe as- 
sumes that if a particle travels X distance, it 
loses Y energy—a kind of average scattering 
that isn’t always realistic, particularly in reac- 
tions that happen quickly. “It fits the data so 
well until you find out it doesn’t,” Fryer says.
Replacing it could mean simulating each 
particle and its particulars—too tough a task 
even for the latest supercomputers. “This is 
why we haven’t done it,” Fryer says. 
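
A toy model can show why an averaged recipe works over long paths and struggles over short ones. The sketch below is not Bethe’s formula and is not drawn from any weapons code. It assumes a made-up particle that collides at random intervals and loses a random amount of energy each time; the constants `MEAN_FREE_PATH` and `MEAN_LOSS` and the function names are hypothetical, chosen only for illustration.

```python
import random
import statistics

MEAN_FREE_PATH = 1.0  # hypothetical average distance between collisions
MEAN_LOSS = 0.05      # hypothetical average energy lost per collision

def average_loss(distance):
    """Averaged recipe: travel distance X, lose a fixed energy Y."""
    return MEAN_LOSS * distance / MEAN_FREE_PATH

def sampled_loss(distance, rng):
    """Follow a single particle collision by collision."""
    travelled = rng.expovariate(1.0 / MEAN_FREE_PATH)
    lost = 0.0
    while travelled < distance:
        lost += rng.expovariate(1.0 / MEAN_LOSS)  # random loss in this collision
        travelled += rng.expovariate(1.0 / MEAN_FREE_PATH)
    return lost

def compare(distance, n_particles=5000, seed=1):
    """Return (recipe loss, sampled mean loss, spread of sampled losses)."""
    rng = random.Random(seed)
    losses = [sampled_loss(distance, rng) for _ in range(n_particles)]
    return average_loss(distance), statistics.mean(losses), statistics.pstdev(losses)

for distance in (2.0, 100.0):
    recipe, mean, spread = compare(distance)
    print(f"distance {distance:6.1f}: recipe {recipe:.3f}, "
          f"sampled mean {mean:.3f}, relative spread {spread / mean:.2f}")
```

In this toy setting the averaged recipe matches the sampled mean at both distances, but over the short path individual particles scatter widely around that mean. That is one way to picture why an average can fit the data well and still miss fast processes, and why tracking every particle is so much more expensive.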

But other approximating physics could 
be replaced by better or truer formu- 
las. Los Alamos theoretical physi- 
cist Mark Paris is working on a 
numerical approach to solving the 
nonlinear differential equations that 
pulse throughout the weapons codes. 
“You're actually solving the system of 
equations that govern the system,” 
he says, “not the pastiche of physical 
mechanisms that are approximately 
derived from the system of equations.” 

Simulation can’t be the only tool 
used to understand the bombs, how- 
ever. All humans, even weapons physi- 
cists, are storytellers, Nakhleh says, 
and the simulations help them create 
confident narratives. But that can only 
go so far. “At some point,” Nakhleh says, 
“you have to step into the unknown— 
walk into the dark room, and see, 
‘What did the experiment have to say?’”


THAT IS THE POINT of expensive, high-powered 
efforts like DARHT. To help illuminate the 
inner workings of bomb primaries, Los 
Alamos wants to increase the number of 
DARHT tests per year, currently seven, and 
improve its imaging abilities so it can take 
more x-ray pictures during any given test.

A second facility, an underground complex
in Nevada called U1A, is also being revamped. 
It will soon be home to the Enhanced Capa- 
bilities for Subcritical Experiments (ECSE), 
a setup in which scientists will implode real 
plutonium, tiptoeing toward a chain reaction 
without actually triggering one. In ECSE, sci- 
entists will take x-ray pictures of scale-model 
pits as they collapse and investigate how neu- 
trons behave during those crucial instants be- 
fore the nuclear detonation. Because there is 
no nuclear explosive yield, such experiments 
technically adhere to the test-ban treaty.

Perhaps the most famous experimental
site is Livermore’s National Ignition Facility 
(NIF), which focuses 192 laser beams onto a
thimble-size target containing hydrogen isotopes
to spark tiny fusion explosions. NIF creates
temperatures and pressures that don’t
exist anywhere else on Earth, says Laura 
Berzak Hopkins, associate program director 
for integrated weapons science at Livermore. 
“These conditions are those of astrophysical 
bodies—the center of Jupiter, the core of the 
Sun,” she says. 

NIF drew headlines in 2022 when it pro- 
duced more energy from the thimble than 
the lasers put in, a milestone relevant to ci- 
vilian efforts to generate fusion power. But 
the achievement came about a decade and 
billions of dollars later than scientists first 
expected. And they still can’t accurately pre- 
dict how much energy they'll get out of a 
given fusion shot. “We don’t know the physics,”
Fryer says. That physics is important for
understanding the fusion components of the
weapons—and also for how they would themselves
hold up to a nuclear blast, a branch of
research dubbed “weapon survivability.” 


NO MATTER HOW GOOD the combination of 
theory, simulation, and experiment gets, 
it will probably never fully represent what 
happens in a weapon, Nakhleh says. “Omniscience
is going to be a ways away,” he says.
“The idea is to push that boundary of knowl- 
edge as far as possible.” 

That knowledge isn’t only important for 
maintaining an arsenal. It’s also important 
for broadcasting to the world that the coun- 
try knows the weapons will work. Nuclear 
deterrence—the idea that one country can 
prevent attacks by threatening an attack of 
similar magnitude—only holds up if the other 
country actually finds your threat credible. 

In the era of explosive nuclear testing, con- 
veying that message was simple. Other coun- 
tries could pick up the seismic signal from an 
Earth-shaking blast half a world away. That
left “no doubt in the minds of our adversaries
or allies,” Adams says. Without tests, the
United States has to signal confidence in a 
quieter way. “If you can prove to them you 
understand this physics well enough, you're 
not bluffing,” Fryer says. 

That’s one reason why the national labs 
also work on unclassified, fundamental sci- 
ence that overlaps with weapons science, 
subjects like star formation and supernovae. 
Lab scientists can publish that work, talk 
about it, stick it on a poster at an interna- 
tional conference. In that sense, Fryer says, 
Los Alamos’s Center for Theoretical Astro- 
physics “is a deterrent.” 

Some still doubt, though, whether that 
physics-based storytelling will continue to ad- 
equately substitute for testing as the weapons 
overhaul progresses. “It’s all well and good 
for the engineers to go, ‘Boom, here’s this new
warhead. We’re super sure it works,’” Wilson
says. That might not be enough certainty for
the military. At a certain point, Wilson says,
someone might say that “a cheaper way to do
this would be ‘Let’s just blow one up.’”

The Trinity supercomputer at Los Alamos National Laboratory simulates weapons explosions at high resolution.

And there’s some political appetite for test- 
ing: Senator Tom Cotton (R-AR), for instance, 
has suggested the country withdraw from 
the test-ban treaty. In 2020, he proposed an 
amendment to the National Defense Authori- 
zation Act that would provide funding to pre- 
pare for potential nuclear tests. It passed the 
Senate, but the House of Representatives’
version of the bill prohibited such spending. 

Holding off any push for testing motivates 
Fryer to dig deeper into the physics, he says. 
“For me, it comes down to ‘I don’t want to 
resume testing,’” he says. If the alternative is
understanding the physics better, so be it. 


Sarah Scoles is a journalist in southern Colorado and 
author of a forthcoming book about the 21st century 
nuclear complex.

Could you spare an acre for conservation?


Private landowners must engage in biodiversity conservation 


By Ricardo B. Machado and 
Ludmilla M. S. Aguiar 


Biodiversity, including species, ecosystems,
and ecosystem services, is in
rapid decline worldwide (1), with severe
consequences for human populations
(2). Several countries have
established protected areas, the back- 
bone for conserving species and habitats (3). 
However, who should oversee the conserva- 
tion of biodiversity? The government? The 
private sector? Local communities? All of 
them? On page 298 of this issue, De Marco 
et al. (4) present an analysis of the Brazilian 
Cerrado—a vast tropical savanna in central
Brazil—that suggests that sharing this responsibility
with the private sector could rapidly
increase international commitments to
avoid biodiversity loss. 

Protected areas across the world belong to 
different management categories (such as na- 
tional parks, wilderness areas, and managed 
resource protected areas) that cover ~15.8% 
of all terrestrial and inland waters and 
~8.16% of marine environments (5). These 
numbers are far from the 30x30 protection 
target (30% of the planet’s surface protected 
by 2030) that was proposed during the 15th 
meeting of the Conference of the Parties to 
the United Nations Convention on Biological 
Diversity held in Montreal, Canada, in
December 2022. To achieve this target, the
193 member countries of the Convention 
should add ~2.685 billion ha to the protected-area
system. This means that both public and private
areas must be considered (6). The role of
private lands can be substantial in regions 
where private ownership is dominant, such 
as the Brazilian Cerrado. 

De Marco et al. analyzed data on natural
vegetation remnants and the potential distribution
of 290 threatened vertebrates (except
fish) across the Brazilian Cerrado, as well
as the legal requirements in Brazil regarding
the protection of native vegetation. They
conclude that private areas could hold 15.5
to 25% of the species present in the region.
The role of private land in protecting the
Cerrado’s biodiversity could be made more
substantial, especially by restoring degraded
areas. The study suggests that private landowners
could keep 15 million to 20 million
ha of degraded areas located in
private areas covered by native
vegetation. Such areas were excessively
occupied by human
populations, despite the legal
obligation of private landowners
to maintain 20 to 35% of their
property covered by native vegetation
(7-9). Therefore, restoring
degraded areas in private lands
and halting devegetation are
two necessary measures to promote
biodiversity conservation
outside public protected areas.

An agricultural field is positioned next to a native
Cerrado region in Formosa do Rio Preto, Brazil (2019).
Changes to Brazilian laws about the cultivation
of private land could help reduce the loss of natural
vegetation in the Cerrado.

The natural vegetation removal
rate is very high in the
Brazilian Cerrado. Between
2002 and 2011, devegetation
was nearly 1% per year, a value
2.5 times higher than that for
deforestation in the Amazon (9).
This amounts to nearly 1 million
ha of loss per year. In 2022,
the National Institute for Space
Research (a research unit of the
Brazilian Ministry of Science,
Technology and Innovations)
released a report indicating a
reduction in devegetation in the
Cerrado after 2013 (10). However,
over the past 4 years, devegetation
increased to ~1 million ha
per year in 2022 (see the figure),
which included privately owned regions. This
decrease in vegetation has reduced connectivity
of the Cerrado landscape and, consequently,
has compromised the availability of
natural habitat for the Cerrado’s species and
the local dynamic of wild populations (11).

Loss of vegetation in the Brazilian Cerrado
Devegetation across the Brazilian Cerrado over the terms of five consecutive
presidents shows that vegetative loss decreased from about 2004 to 2011, and
then again from 2013 to about 2019. Devegetation has increased since 2019.
Presidents labeled in the figure (2000 to 2020): Fernando Henrique Cardoso,
Luis Inacio Lula da Silva, Dilma Rousseff, Michel Temer, Jair Bolsonaro.

What is the contribution of private lands
to biodiversity conservation worldwide? Data
on protected areas indicate that it is about
1.41% of the area under protection (5). Thus,
the participation of private lands, including
those from nongovernmental institutions,
could be more robust. About 75% of the
Cerrado’s 2 million km² is privately owned,
constituting only 0.06% of the protected area,
which leaves 74.94% of private areas without
protection (12). This is in contrast with the
3.09% of the biome’s public area that is for
biodiversity conservation.

Convincing landowners to engage in biodiversity
conservation is challenging because
agribusiness profit is much more attractive
than the few incentives offered by the government.
The latter include an exemption from
rural territory tax, payment for water supply,
and priority in the analysis of rural credit
applications in official banks. Agribusiness
activity expansion is continuing to replace
native areas (13). Although clearing native
vegetation areas is cheaper than restoring
degraded areas, it is not sustainable.

New strategies that could prevent further
degradation by keeping natural vegetation 
inside private areas are needed. This will 
require solving conflicts between diver- 
gent Brazilian laws. The Native Vegetation 
Protection Law (federal law no. 12.651/2012, 
known as the Brazilian Forest Code) allows, 
but does not force, private landowners to 
clear up to 80% of their property to imple- 
ment agricultural activities. Theoretically, 
a landowner could pursue agriculture in a 
smaller portion of the property, leaving the 
rest covered by native vegetation. However, 
a law known as the Agrarian Reform Law 
(federal law no. 8629/1993) indicates that 
unproductive rural private properties that
are not economically exploited “properly”
are subject to expropriation—that is, the
rural owner must economically exploit at
least 80% of the property. Thus, the landowners
choose to remove natural vegetation
to reduce the risk of losing property
to the government. Another substantial 
change that is needed is developing a new 
economic model for the Cerrado based on 
the sustainable exploitation of the biodi- 
versity components without extinguish- 
ing them. Payment for ecosystem services 
on private lands in the Cerrado is another 
action that should be expanded and regu- 
lated throughout the biome. Certification 
of rural properties that bal- 
ance agricultural production 
and biodiversity conservation 
is a valid incentive to encour- 
age landowners to conserve. In 
addition, private landowners of 
degraded areas in the Cerrado 
should have financing and 
technical support to recover 
these areas. 

The most important message 
that the decision-makers should 
take home from the study of De 
Marco et al. is that the role of private
rural properties in protecting
native species in the Cerrado
is vital to complement the con- 
servation activities promoted 
within public protected areas. 
Only a combination of public 
and private efforts will achieve 
the international commitments 
to prevent biodiversity loss. 


Tumor suppression by RNA surveillance


A cyclin-dependent kinase triggers degradation of prematurely terminated RNAs 


By Robert P. Fisher 


Cyclin-dependent kinases (CDKs) form
heterodimers with cyclins to control
cell division and RNA polymerase II
(RNAPII)-dependent transcription (1).
The transcriptional CDK family (CDK7, 
-8, -9, -11, -12, -13, and -19) is the focus 
of recent drug discovery efforts, which are 
premised on the heightened transcriptional 
dependencies of cancer cells (2, 3). On page 
258 of this issue, Insco et al. (4) report un- 
covering a tumor-suppressive role of CDK13. 
Their findings challenge two long-standing 
generalizations—that transcriptional CDKs 
pair monogamously with dedicated cyclin 
partners and that they execute their func- 


Analysis of steady-state RNA levels in 
CDK13mut human cells revealed accumulation
of prematurely terminated RNA (ptRNA) 
ending at intronic polyadenylation (IPA) 
sites. The distribution of nascent transcripts 
was largely unchanged, suggesting that in- 
creased stability of truncated transcripts, 
rather than increased frequency of prema- 
ture termination, explained the ptRNA ac- 
cumulation. Consistent with this mechanism, 
in human melanoma cells CDK13 was asso- 
ciated with polyadenylate-binding nuclear 
protein 1 (PABPN1) and zinc finger CCCH domain-containing
protein 14 (ZC3H14), which
interact with the poly(A) tail exosome tar- 
geting (PAXT) complex (6, 7) that normally 
ensures rapid degradation of ptRNA. CDK13-cyclin
T1 phosphorylated ZC3H14 on a serine residue in
vitro; this phosphorylation was diminished
in CDK13mut cells. Moreover, expression of
a ZC3H14 variant in which that serine is replaced
by a nonphosphorylatable alanine residue
reduced the interaction of ZC3H14 with PAXT,
Loss of RNA quality control drives cancer
targeted by PAXT for degradation, which is controlled by CDK13-cyclin
(CDK13mut), nuclear RNA surveillance is defective owing to loss of
ZC3H14 phosphorylation (P) and stabilization of oncogenic,
polyadenylated ptRNAs that are then translated in the cytoplasm.

CDK13 zebrafish. These findings suggest
that ptRNA accumulation is
intrinsically oncogenic and that a 
nuclear RNA surveillance pathway is 
tumor suppressive and vulnerable to 
genetic ablation. 

It is less clear how ptRNA accu- 
mulation contributes to oncogenesis, 
which did not seem to depend on sta- 
bilization of specific truncated tran- 
scripts or loss of the corresponding, 
full-length mRNAs. Insco et al. pro- 
vide evidence that ptRNA accumula- 
tion can perturb cellular metabolism. 
In CDK13mut human melanoma cells,
ptRNAs were exported to the cyto- 
plasm and translated, leading to the 
production of truncated proteins in 
amounts equaling those of many full- 
length proteins. The aberrant trans- 
lation products contained sequences 
encoded in introns, which could gen- 
erate neoantigens that elicit tumor- 
specific immune responses and might 
be leveraged for immunotherapy (see 
the figure). Whether these truncated 
proteins account for the proliferative 
advantage or increased aggressive- 
ness of cancers deficient in PAXT- 
enforced RNA surveillance requires 
further investigation. In a previous 
study, impairment of this pathway led 
to cytoplasmic transport of normally 
unstable transcripts and global re- 
pression of protein synthesis (7). 

There is precedent for increased 
accumulation of ptRNA arising from 
IPA events in cancer (8). Mutations in 
the gene encoding CDK12—a CDK13
paralog that partners with cyclin K (5)—promote
ptRNA accumulation by increasing
the frequency of premature termination (9, 
10) rather than acting posttranscriptionally. 
Mutations in CDK12 and CDK13 generate dif- 
ferent ptRNA profiles, with CDK12 mutations 
preferentially affecting genes involved in an 
effective DNA damage response (DDR) (9, 
10). The consequent reduction in full-length 
mRNAs encoding DDR factors is thought to 
underlie defects in homology-directed repair 
and sensitivity to poly(ADP-ribose) poly- 
merase (PARP) inhibitors—a BRCA-like phe- 
notype—of CDK12 mutant cancers (11). The 
findings of Insco et al. raise the possibility 
that ptRNA translation might also contribute 
to oncogenesis in CDK12 mutant tumors. 

CDK12 and -13 share ~92% identity in their 
kinase domains, have similar structures in 
complexes with cyclin K (5, 12, 13), and are 
sensitive to the same small-molecule inhibi- 
tors (14). Selective inhibition of either CDK 
caused perturbations of RNAPII elongation, 
which were more widespread when both ki- 
nases were inactivated (15). Therefore, CDK12 
and -13 are at least partially redundant. The 
requirement of CDK13 in nuclear RNA sur- 
veillance, as uncovered by Insco et al., may be 
unique to CDK13 and specific to CDK13-cyclin
T1. By associating with multiple cyclins,
CDKs involved in cell cycle control acquire 
different substrate specificities or subcellular 
localizations (1); this may be an example of 
such promiscuity by a transcriptional CDK, 
but whether CDK13-cyclin T1 differs from 
CDK13-cyclin K in substrate specificity or 
functional targeting (e.g., recruitment to 
chromatin or activity in the nucleoplasm) re- 
mains to be tested. It is not necessarily the 
case, however, that ZC3H14 phosphorylation 
must occur posttranscriptionally; it might oc- 
cur during transcript elongation but execute 
its function after transcription terminates. 
As drugs targeting transcriptional CDKs ad- 
vance toward clinical applications, a better 
understanding of CDK function in posttran- 
scriptional RNA metabolism, in both normal 
and cancer cells, is needed.

Organization of the ctenophore (Mnemiopsis
leidyi) nerve net raises questions about
the evolution of the animal nervous system.

Neurons that connect
without synapses 


The ctenophore nerve net suggests a complex 
evolutionary history of the animal nervous system 


By Casey Dunn 


Long-standing wisdom about the evolution
of the animal nervous system
posits that all neurons connect to
each other with synapses and that
the nervous system arose once in evolutionary
history and was never lost.
But this tidy picture has seen surprising 
challenges in recent years. On page 293 of 
this issue, Burkhardt et al. (1) provide new 
information on the structure of the ner- 
vous system of ctenophores—marine inver- 
tebrates commonly known as comb jellies. 
These exciting findings further erode this 
traditional view and help build a fascinat- 
ing and more complex understanding of 
nervous system evolution. 

All living animals belong to one of five 
groups. Of these, Porifera (sponges) and 
Placozoa (small, disc-shaped animals) lack 
neurons. Ctenophora (comb jellies) and 
Cnidaria (corals, medusa jellyfish, siphono- 
phores, and others) have nerve nets—ner- 
vous systems with neurons arranged into 
diffuse networks. Bilateria (the group that 
contains most animal species, including ver- 
tebrates, arthropods, and many other inver- 
tebrates) includes some animals with a nerve 
net, but most have a central nervous system.

The traditional explanation for this nervous
system diversity is that these organisms
represent ancestral steps in the increase of 
nervous system complexity. In this scenario, 
sponges diverged first from other animals, 
before the origin of the nervous system (2), 
and nervous system complexity increased in 
a ratchet-like manner in other animals. 

It was unexpected, then, when the first 
Placozoa (3) and Porifera (4) genomes were 
sequenced and found to contain genes that 
were previously thought to be specific to 
nervous system function. A closer look in 
placozoans found that they have gland cells 
that secrete neurosecretory components (5). 
More recently, single-cell expression analyses 
revealed that some sponge cells communi- 
cate through structures that resemble syn- 
apses (6). This has made it clear that differ- 
ent nervous system features, such as neuron 
morphology and neuron signaling molecules, 
have different distributions across animals. 

In parallel, traditional hypotheses about 
the earliest relationships in the animal 
phylogeny have been challenged. Some 
phylogenomic analyses support Porifera 
as the sister group to all other animals (7). 
There is growing evidence, however, that 
Ctenophora is the sister group to all other 
animals (8, 9). The latter indicates that 
some nervous system features arose in- 
dependently in ctenophores or that some 
nervous system components were lost in
sponges. This further challenges the historically
accepted notion that there has been a
simple, stepwise increase in nervous system 
complexity through the course of animal 
evolution. 

Burkhardt et al. provide critical under- 
standing about the structure of the cteno- 
phore nerve net that goes right to the heart of 
these questions. The authors report that the 
ctenophore nerve net is unlike the nervous 
systems of other animals. The difference is 
particularly relevant to debates at the dawn 
of neurobiology. In the late 19th century, 
Golgi proposed that the nervous system is 
a syncytial continuum, with the neurons di- 
rectly connected with shared cell membranes 
and cytoplasm (10). This is known as the reticulate
theory of nervous system structure.
Ramon y Cajal proposed instead that neurons
are distinct cells (11). This is known as
the neuronal doctrine, and the discovery of 
synapses seemed to settle the debate in favor 
of this view. 

Burkhardt et al. used serial block face 
scanning electron microscopy to make three- 
dimensional ultrastructural reconstructions 
of a ctenophore subepithelial nerve net. They 
observed that this nerve net is not formed 
by neurons connecting to each other with 
synapses. Instead, the processes of the neu- 
rons are directly fused to each other, forming 
a syncytial continuum. There are synapses 
elsewhere, including where the nerve net 
connects to effector cells, but the subepithe- 
lial nerve net itself is not formed with synap- 
tic connections. 

The findings of Burkhardt et al. suggest 
that Ramón y Cajal’s neuronal doctrine and
Golgi’s reticulate theory are not universally 
exclusive hypotheses. Most animals with 
nervous systems (cnidarians and bilaterians) 
conform to the neuronal doctrine of separate 
cells that communicate through synapses. 
The subepithelial nerve net of this cteno- 
phore species consists of fused neurons, as 
in Golgi’s reticulate theory. It was never a 
question, then, of whether all animal nervous 
systems conform to the neuronal doctrine or 
the reticulate theory but rather of describing 
which animals conform to which theory. 

There is much that remains unknown 
about the anatomy, physiology, genome bi- 
ology, and natural history of nonbilaterian 
animals. This creates the dual illusions that 
nonbilaterians are simpler than they are 
(because studies tend to focus on bilaterian 
traits they lack rather than the many distinc- 
tive traits that they have) and that these ani- 
mals are more similar to each other than they 
actually are (because superficial similarities 
are often prioritized over clear differences) 
(12). Studies such as that of Burkhardt et al. 
are important for dispelling such illusions. 
For example, ctenophores and cnidarians are 
both transparent and squishy (they are “jellyfish”
in the broad sense), and both have nerve
nets. Historically, they have therefore been 
placed together in the animal tree of life as 
Coelenterata. This grouping has been taken 
as evidence that Ctenophora cannot be a sis- 
ter group to all other animals. But many ani- 
mals that live suspended in the open ocean 
(as many ctenophores and cnidarians do), 
including some annelids and molluscs, have 
independently converged in soft transparent 
bodies, which indicates that so-called jellyfish 
traits are not good support for Coelenterata 
(12). This leaves the nerve net as one of the 
only traits uniting Ctenophora and Cnidaria 
into the single group Coelenterata. Burkhardt 
et al. show that resemblances of nerve nets 
between Ctenophora and Cnidaria are also 
superficial, and they remove some of the last 
remaining evidence for Coelenterata. 

The findings of Burkhardt et al. help drive 
home the point that the gain of the nervous 
system should not be marked as a singular 
event in the history of animal evolution. 
Instead, the evolution of many constituent 
traits that can together make up a nervous 
system should be considered, including mor- 
phology, molecular inventory, and physiology 
(13). Some animals may have lost nervous
system components, as may be the case in 
sponges. Sponges are filter feeders, which 
tend to have reduced nervous systems even 
within Bilateria. Other animals may have 
convergently evolved superficially similar 
nervous system features, such as the nerve 
nets of ctenophores and cnidarians, when 
faced with the same functional challenges. 

It is exciting that such fundamental ob- 
servations about animal anatomy, like those 
described by Burkhardt et al., can still have 
such big implications for the study of animal
evolution. The work of Burkhardt et al.
shows how much potential lies at the grow- 
ing intersection of comparative morphol- 
ogy, phylogenetics, physiology, and genom- 
ics. Answering the most important open 
questions about early animal evolution will 
require the integration of all these diverse 
approaches and perspectives. 

Fixing the desalination membrane pipeline

Materials discovery alone has not translated into lower-cost water treatment



By Jeffrey R. McCutcheon and Meagan S. Mauter


Global water scarcity is motivating
the expanded treatment of seawa- 
ter, brackish water, and wastewater. 
Robust treatment trains typically 
include semipermeable reverse os- 
mosis (RO) membrane barriers that 
allow the passage of clean water while re- 
taining the majority (>99%) of salts, dis- 
solved organics, and pathogens. Despite 
considerable research effort to optimize 
membrane chemistry, morphology, and 
module designs for diverse source-water 
and end-use applications, most treatment 
trains deploy RO membrane modules that 
closely resemble those developed for seawa- 
ter desalination over 50 years ago. The en- 
during dominance of these traditional RO 
membranes reveals a broader need within 
the water treatment community to reassess 
the innovation pipeline for membranes for 
desalination and water treatment. 

Past breakthroughs in membrane-based 
processes for desalination and water treat- 
ment were enabled by the joint discovery 
of new materials with desirable separation 
properties alongside manufacturing tools 
for processing these materials into mem- 
branes at scale. The first set of innovations 
in the 1960s combined the high-salt-re- 
jecting properties of cellulose acetate with 
the nonsolvent-induced phase separation 
manufacturing process (1). Twenty years 
later, the discovery of aromatic polyamide 
materials manufactured through interfacial 
polymerization led to the thin-film compos- 
ite (TFC) membrane that delivered 10-fold 
improvements in both water productivity 
and salt rejection (2). Forty years later, the
polyamide TFC membrane remains the gold
standard RO membrane and is deployed in
nearly every RO membrane treatment train,
from the largest seawater desalination
plant to the point-of-use RO system under
the kitchen sink.

Figure: Materials innovation in a systems and manufacturing context. Research on emergent membrane materials for reverse osmosis in water treatment trains has often failed to consider the systems-level context or potential for manufacturing scalability. Fixing the desalination membrane technology pipeline requires a more holistic approach to materials discovery that addresses contextual and technology gaps. Contextual gap: address performance in a module, robustness over relevant life cycles, and systems-level impact.

Although TFC membranes represented 
a step change for seawater desalination 
membrane performance, they are far from 
a perfect solution for the diverse range of 
nontraditional source waters that are now 
treated. The charge density that makes 
polyamide so effective for rejecting dis- 
solved ions does not translate to efficient 
rejection of small neutral molecules (e.g., 
N-nitrosodimethylamine, urea, or boron) 
that is critical for water reuse (3). Polyamide 
membranes are also susceptible to foul- 
ing and scaling, degrade in the presence of 
oxidative cleaning products, and only with- 
stand moderate pressures (~65 bar) before 
performance degradation or failure. 

Today, desalination system designers
have engineered process solutions to
membrane fouling, poor rejection, oxida- 
tive degradation, and low burst pressures. 
Extensive pretreatment systems reduce 
fouling, dechlorination prior to RO reduces 
oxidative damage, advanced oxidation pro- 
cesses destroy poorly rejected organic mol- 
ecules in the permeate, and thermal solu- 
tions are deployed for brine concentration. 
These interventions increase the specific 
energy consumption and the overall levelized
cost (the combined operational and
amortized capital costs) of desalination and 
water treatment (4). 
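
To make the levelized-cost definition above concrete, the following is a minimal sketch in Python. It assumes a standard capital recovery factor for amortizing capital costs. The function names, the discount rate, the plant lifetime, and all cost figures are hypothetical illustration values, not data from the article or its references.

```python
def capital_recovery_factor(rate: float, years: int) -> float:
    """Fraction of the capital cost charged each year to amortize it over `years`."""
    if rate == 0:
        return 1.0 / years
    growth = (1.0 + rate) ** years
    return rate * growth / (growth - 1.0)


def levelized_cost(capex: float, annual_opex: float,
                   annual_volume_m3: float, rate: float, years: int) -> float:
    """Levelized cost in currency units per cubic meter of product water."""
    annualized_capex = capex * capital_recovery_factor(rate, years)
    return (annualized_capex + annual_opex) / annual_volume_m3


# Hypothetical plant, for illustration only.
baseline = levelized_cost(capex=100e6, annual_opex=8e6,
                          annual_volume_m3=30e6, rate=0.05, years=25)

# Same hypothetical plant with an added process step (more capital, more energy).
with_pretreatment = levelized_cost(capex=115e6, annual_opex=9.5e6,
                                   annual_volume_m3=30e6, rate=0.05, years=25)

print(f"baseline: {baseline:.3f} per m3")
print(f"with added process step: {with_pretreatment:.3f} per m3")
```

Under these assumed inputs, every added process step shows up in both terms of the numerator, which is why process workarounds for membrane limitations raise the levelized cost even when water output is unchanged.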

Simultaneously, the membrane research 
community has proposed several new classes 
of materials that address these problems. 
Reports of membranes with increased water 
permeability, superb selectivity, fouling resis- 
tance, and chemical tolerance are pervasive. 
One popular approach enhances the proper- 
ties of the polymeric matrix by embedding a 
second, nonpolymeric material with distinct 
transport, charge, or interfacial properties to 
form a “mixed matrix” membrane. Proposed 
fillers include carbon nanotubes, graphene, 
graphene oxide, zeolites, aquaporin, and 
metal organic frameworks. A second ap- 
proach considers materials as alternatives 
to polyamide membranes. These alternative 
materials are intended for use as standalone 
membranes (not mixed matrix) and can be 
categorized into organic polymers and in- 
organic materials. Examples of polymer al- 
ternatives include other condensation poly- 
mers such as polyester (5) and epoxide-based 
membranes (6). Inorganic materials include 
ceramic, carbon molecular sieves, graphene, 
or graphene oxide. Unfortunately, none of 
these materials have yet displaced polyam- 
ide TFC membranes in large-scale applica- 
tions (7). These materials are currently un- 
able to be manufactured at scale; cast into 
defect-free, ultrathin membranes; or rolled 
into modules that are readily integrated into 
treatment skids. 

Figure panel (Technology gap): address manufacturability at relevant scales, quality, and price.

These failures reveal a critical gap in the 
desalination membrane innovation pipe- 
line. Membrane materials discovery, design, 
and synthesis have become divorced from 
membrane manufacturing, membrane com- 
ponent design, and process systems engi- 
neering. Fixing the membrane innovation 
pipeline will require three key interven- 
tions: a strong system-level contextualiza- 
tion during the materials discovery process, 
rigorous methods for testing new materials 
under relevant conditions and life cycles, 
and early-stage screening for manufactur- 
ability at relevant scales (see the figure). 

To contextualize new membrane materi- 
als discovery, the value proposition of a ma- 
terial in a desalination membrane should 
be quantified in the context of both the unit 
process and the entire water treatment train 
(8). For example, aligned carbon nanotube 
membranes with near-frictionless pores 
promised to enhance water permeability 
(9), but membrane productivity is mostly 
limited by poor mass transfer of concen- 
trated salt away from the membrane sur- 
face. In short, the wrong problem was being 
solved. Optimized and efficiently manufac- 
tured membrane spacer designs would have 
been a more productive research avenue for 
addressing low water productivity. Process 
systems optimization models coupled with 
technoeconomic and lifecycle assessment 
are foundational tools for identifying im- 
pactful innovation targets at the unit pro- 
cess and treatment train levels. These tools 
prioritize innovation needs by quantifying
the system-level benefits of innovations 
such as membranes with improved organ- 
ics rejection, better chlorine resistance, or 
higher burst pressures. They also allow sci- 
entists to define upper limits on the value of 
membrane performance improvements and 
explore trade-offs in capital and operating 
costs (10). 
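
The carbon nanotube example can be illustrated with a simplified film-theory model of concentration polarization, in which the osmotic pressure at the membrane surface grows exponentially with water flux. This is a sketch under stated assumptions (complete salt rejection, a single lumped mass transfer coefficient, and hypothetical values for permeability, applied pressure, and bulk osmotic pressure), not the analysis used in the cited studies.

```python
import math


def water_flux(permeability: float, mass_transfer_coeff: float,
               applied_pressure: float = 60.0,
               bulk_osmotic_pressure: float = 27.0) -> float:
    """Solve J = A * (dP - pi_bulk * exp(J / k)) for the flux J by bisection.

    Assumed units: A in L m^-2 h^-1 bar^-1; k and J in L m^-2 h^-1;
    pressures in bar. Requires applied_pressure > bulk_osmotic_pressure.
    """
    def residual(flux: float) -> float:
        # Film theory: salt accumulates at the wall, raising osmotic pressure there.
        wall_osmotic = bulk_osmotic_pressure * math.exp(flux / mass_transfer_coeff)
        return permeability * (applied_pressure - wall_osmotic) - flux

    low = 0.0
    # Flux at which the wall osmotic pressure would equal the applied pressure.
    high = mass_transfer_coeff * math.log(applied_pressure / bulk_osmotic_pressure)
    for _ in range(100):
        mid = 0.5 * (low + high)
        if residual(mid) > 0.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


# Hypothetical values, for illustration only.
base = water_flux(permeability=1.0, mass_transfer_coeff=100.0)
more_permeable = water_flux(permeability=10.0, mass_transfer_coeff=100.0)
better_mixing = water_flux(permeability=10.0, mass_transfer_coeff=200.0)

print(f"baseline flux:            {base:.1f}")
print(f"10x permeability:         {more_permeable:.1f}")
print(f"10x permeability, 2x k:   {better_mixing:.1f}")
```

With these assumed inputs, raising permeability tenfold raises the computed flux by much less than tenfold, whereas doubling the mass transfer coefficient of the high-permeability membrane (the role that spacer design plays) raises the flux substantially further.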

Once a system-level need is identified, 
potential material solutions must be tested 
under relevant conditions. Seawater desali- 
nation facilities expect to replace membranes 
on a 5- to 7-year cycle, but few researchers 
assess materials robustness over extended 
durations or repeated cleaning cycles. For 
example, although membrane surfaces modi- 
fied with hydrophilic polymers or nanoscale 
textures might inhibit fouling in a 3-hour 
laboratory experiment on synthetic feed- 
water, testing in real waters over long peri- 
ods of time can yield very different results. 
Organics, extracellular polymeric substances, 
or precipitated inorganic materials quickly 
cover sophisticated chemistries and textures, 
rendering them ineffective or even detrimen- 
tal to performance if they inhibit routine 
cleaning. Successful fouling management 
approaches acknowledge that the membrane 
surface changes the moment the membrane 
is used and instead focus on new methods to 
sense and manage fouling in real time. 

Manufacturability at scale is another 
essential criterion for high-impact desali- 
nation membrane development. Very few 
recently proposed desalination membrane 
materials have scalable or economically vi- 
able manufacturing pathways for serving 
the municipal drinking water market. For 
example, despite many years of effort, man- 
ufacturers have never fabricated inorganic 
materials at a quality, scale, and price point 
that are attractive for desalination applica- 
tions. Furthermore, the impossibility of de- 
fect-free fabrication techniques over square 
kilometer areas renders single-layer inor- 
ganic materials a poor fit for deployment 
in desalination applications. Mixed matrix 
membranes offer some degree of process- 
ability and scalability of a polymer, but con- 
ventional manufacturing processes, such 
as casting or extrusion, make directional 
or location control of the filler material 
challenging. Likewise, emergent polymeric 
materials, such as self-assembled polymers, 
intrinsically porous polymers, and polyelec- 
trolyte complexes, are exciting classes of 
potentially easy-to-manufacture materials, 
but the continued reliance on decades-old 
manufacturing processes for polymers may 
prevent these materials from exhibiting 
their desired properties in a membrane. 
New or modified manufacturing processes 
must be developed to address these weaknesses
while also enabling production rates
on the order of meters per minute at widths 
of up to 40 inches, the industry standard for 
current desalination membranes. 
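
The scale implied by these figures can be checked with simple arithmetic. The sketch below converts a line speed and web width into membrane area per minute and estimates how long one line would need to run to produce a given area. The 40-inch width comes from the text; the line speed of 5 meters per minute and the target area of one square kilometer are hypothetical values chosen only for illustration.

```python
INCH_TO_M = 0.0254  # exact conversion factor


def area_rate_m2_per_min(line_speed_m_per_min: float,
                         web_width_in: float = 40.0) -> float:
    """Membrane area produced per minute by one roll-to-roll line."""
    return line_speed_m_per_min * web_width_in * INCH_TO_M


def days_to_produce(area_m2: float, line_speed_m_per_min: float,
                    web_width_in: float = 40.0) -> float:
    """Days of continuous operation needed to produce `area_m2` of membrane."""
    minutes = area_m2 / area_rate_m2_per_min(line_speed_m_per_min, web_width_in)
    return minutes / (60.0 * 24.0)


# Hypothetical line speed and target area, for illustration only.
rate = area_rate_m2_per_min(5.0)
days = days_to_produce(1_000_000.0, 5.0)
print(f"{rate:.2f} m2 per minute; {days:.1f} days for one square kilometer")
```

A material that can only be made as millimeter- or centimeter-scale coupons is many orders of magnitude away from this throughput, which is the gap the paragraph above describes.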

There are opportunities to fix these weak- 
nesses in the membrane innovation pipe- 
line for water treatment. Access to design 
tools that help researchers identify high- 
impact materials needs and R&D priorities 
for water treatment membranes would be 
particularly valuable (10). Collective national
efforts, such as the National Alliance
for Water Innovation (NAWI) in the United 
States, have provided baselines for compo- 
nent and unit process costs of diverse treat- 
ment trains, and developing an open-source 
water treatment costing platform promises 
to be one such resource (11). Additional progress
would be facilitated by full-cost transparency
from the water sector’s vertically
fragmented industrial partners. Accurate 
estimates of component, unit process, and 
facility costs and life-spans are needed to
effectively guide academic and government
agencies toward R&D priorities. 

Access to manufacturing science infra- 
structure is also needed to address these 
weaknesses. Most membrane research hap- 
pens in laboratories that have limited, if any, 
access to membrane manufacturing tools. 
To explore the effectiveness of new materi- 
als in membranes, an understanding of both 
conventional and emergent manufacturing 
techniques (12, 13) applied to these materials 
is needed. To this end, scale-up and prototyp- 
ing facilities for bulk material production, 
membrane manufacturing, and element as- 
sembly are required to support membrane 
researchers’ attempts to address manufac- 
turability throughout the research, devel- 
opment, demonstration, and deployment 
(RDD&D) pipeline. The US Department 
of Energy is investing in manufacturing 
demonstration infrastructure through its 
national laboratory facilities that will sup- 
port this pipeline and the membrane manu- 
facturing ecosystem. Roll-to-roll manufac- 
turing tools are currently available at the 
National Renewable Energy Laboratory, Oak 
Ridge National Laboratory (Manufacturing 
Demonstration Facility), and Argonne 
National Labs (Materials Engineering 
Research Facility). Similar facilities out- 
side of the United States are supported in 
Singapore at the Separation Technologies
Applied Research and Translation Centre
and the Singapore Membrane Technology 
Center at Nanyang Technological University. 
Such capabilities also exist in Australia at 
the Commonwealth Scientific and Industrial 
Research Organisation (CSIRO). In paral- 
lel, historic heavyweights in the membrane 
manufacturing space could more openly ar- 
ticulate manufacturing requirements, and 
experienced scientists from the roll-to-roll, 
extrusion, printing, polymer processing, and 
nanomaterials application areas could lend 
their expertise to water treatment mem- 
brane manufacturing. Such facilities and 
partnerships come with increased cost of re- 
search. This requires funding agencies to ad- 
equately support membrane manufacturing 
innovation and incorporate manufacturing 
research infrastructure and other systems- 
level experts when possible. 

The status quo of overselling a membrane 
made at millimeter scale and promoting 
an unrealistic narrative about how it will 
change water treatment must end. Such 
efforts lead to incremental improvements 
in membrane performance and distract 
researchers from exploring manufacturing 
science and process systems-level questions 
that would yield more impactful results. 
Repairing the desalination membrane in- 
novation pipeline starts by grounding ma- 
terials discovery research in systems-level 
performance improvements, manufactur- 
ability, and life-cycle performance. Doing so 
would substantially improve the membrane 
science community’s focus on applied re- 
search. 

William Steffen (1947-2023)


Earth system scientist with passion for people and planet 


By Johan Rockström and Katherine Richardson


William (“Will”) Lee Steffen, influential
Earth system scientist who
quantified human impacts on the 
global environment and shaped 
public understanding of the plan- 
etary crisis, died on 29 January. 
He was 75. Will’s critical insights contributed 
to our current understanding of Earth as a 
complex adaptive system characterized by 
interactions among a range of biogeochemi- 
cal and physical components, and he warned 
that modern society risks exceeding the lim- 
its of planetary stability. At numerous organi- 
zations focusing on global change research, 
he spearheaded interdisciplinary worldwide 
sustainability initiatives, established bridges 
between climate science and policy, and com- 
municated results with the public. 

Born on 25 June 1947 in the tiny town of 
Clearwater, Nebraska, and raised in Spencer, 
Iowa, Will completed his bachelor’s in chemi- 
cal engineering at the University of Missouri 
in 1970 (supported by an oil company schol- 
arship, a fact he later found ironic given his 
focus on anthropogenic climate change). 
He then moved to the University of Florida, 
where he received a master’s degree in educa- 
tion in 1972; met and married his wife, Carrie; 
and—after a US Peace Corps tour to the Fiji 
Islands—earned his PhD in chemistry in 
1975. Will did a postdoc at Cornell University 
and then accepted a research fellowship in 
chemistry in 1977 at the Australian National 
University (ANU) in Canberra. 

In 1980, Will joined the Commonwealth
Scientific and Industrial Research Organisation
in Canberra. His work measuring land-atmosphere
CO₂ flux laid the foundations for his systems
approach to global environmental change. In 1990,
he moved to
the International Geosphere-Biosphere Pro- 
gramme (IGBP). From 1998 to 2004, based 
in Stockholm, he served as IGBP’s executive 
director. Returning to Canberra in 2004, Will 
served as a science adviser to the government 
until 2011. He became director of ANU’s 
Fenner School of Environment and Society 
in 2007 and served as the inaugural director 
of ANU’s Climate Change Institute from 2008
to 2012. Will’s vast experience provided him 
with an extensive network across scientific 
disciplines, and he propelled Earth system 
science forward by bringing together various 
lines of research, all focusing on different 
aspects of global environmental change. 

In 2011, Will was appointed to Australia’s 
Climate Commission, established by the gov- 
ernment to communicate climate science to 
the public. When an administration skeptical 
of climate change dissolved the commission 
in 2013, Will—who was not afraid of con- 
frontation where warranted—and his fellow 
commissioners launched the independent
organization Climate Council, formed and supported
through a crowdfunding campaign.

Humans’ potential to change the environ- 
ment at the planetary level fascinated Will. 
At a 2000 IGBP meeting, when chemistry 
Nobel laureate Paul Crutzen exclaimed in 
frustration that we are no longer living in 
the Holocene but in the “Anthropocene,” 
Will encouraged him to publicly suggest the 
name change. Eager to understand when 
this change in geological epoch might have 
occurred, Will led a project that compiled 
the first-ever time series of human pressures 
on the planet. The data—recorded in Global 
Change and the Earth System: A Planet Under 
Pressure, the now-classic 2005 volume in an 
IGBP series—revealed the exponential rise in 
global environmental change, establishing the 
“great acceleration” of the effects of humans
on the environment beginning in the 1950s. 
Will was also a member of the Anthropocene 
Working Group of the Subcommission on 
Quaternary Stratigraphy that voted in 2019
to introduce the Anthropocene as a new
geological epoch.

Although based in Australia, Will main- 
tained strong links to Europe, including af- 
filiations at the Stockholm Resilience Centre 
and the Potsdam Institute for Climate Impact 
Research, allowing us to work together closely. 
His personality was an asset when building 
bridges between scientific disciplines and 
negotiating complex issues. In 2009, Will, 
the two of us, and colleagues developed the 
Planetary Boundaries framework, which de- 
fines a quantitative safe operating space for 
humanity on Earth by integrating the science 
on tipping points with our understanding of 
Earth system stability. Will later led the 2015 
scientific update of the framework, which 
showed threats to climate and biosphere 
boundaries (including biodiversity, land use 
change, and nutrient loading). 

Concerned that anthropogenic impacts 
could cause a loss of Earth’s resilience, in 
2018 Will invited a diverse group of schol- 
ars to address the question of how the Earth 
system might respond if human activities 
resulted in a 2°C increase in global average 
temperature. The group warned that anthro- 
pogenic warming could trigger a warming 
cascade. This “Hothouse Earth” hypothesis 
has generated substantial scientific debate. 

Recognizing that neither Earth’s resources 
nor its limits are distributed equitably, Will 
was committed to Earth system justice. He 
called for legal frameworks acknowledging 
that all people are connected and interdepen- 
dent and that collective actions are therefore 
required to safeguard resources such as ice 
sheets, forests, the ocean, and biodiversity. 
The Planetary Boundaries also show the need 
for a fair distribution of remaining ecological 
space and the moral imperative of securing a 
livable Earth for future generations. These di- 
mensions of Earth system justice prompted a 
proposal to establish a United Nations char- 
ter recognizing a stable planet as a “common 
heritage for humankind,” an effort in which
Will was heavily engaged. 

Will’s immersion in and devotion to his 
adopted country were complete. He became a 
citizen and voraciously studied the country’s 
natural history and the history of its 
Indigenous peoples. An avid hiker, he reveled 
in climbing Australia’s hills. 

We will remember Will as an extremely 
likable person and cherished friend who 
combined scientific excellence with humility 
and sincerity. He was a strong supporter of 
a “whole Earth system” approach in science 
and policy, integrating people and planet. 
Through his work, he not only provided 
novel Earth system insights but also strove 
to identify potential pathways for humanity’s 
future on Earth.

POLICY FORUM


RESEARCH SECURITY 


Managing United States-China 
university relations and risks 


US universities must proactively address potential concerns 


By Richard Lester, Lily Tsai, Suzanne Berger,
Peter Fisher, M. Taylor Fravel, David Goldston,
Yasheng Huang, Daniela Rus


The intensifying geopolitical rivalry
between the United States and China
is clouding the outlook for cross-border
academic exchange and collaboration
in science and technology.

Technological competition is a prin- 
cipal focus of this rivalry, and pressures are 
building in both countries to erect higher 
barriers to academic research collabora- 
tions and to restrict the flow of students 
and scholars between the two countries. A 
major challenge for US universities is how 
to manage these pressures while preserving 
open scientific research, open intellectual 
exchange, and the free flow of ideas and 
people. New federal regulations designed 
to strengthen research security on US uni- 
versity campuses are now being introduced. 
Yet federal policies, no matter how well 
crafted, cannot be a substitute for actions 
by universities themselves. We share an 
approach developed at the Massachusetts 
Institute of Technology (MIT) to make clear 
the lines that should not be crossed and the 
principles that should govern academic re- 
lations with China. 

US research universities have long ben- 
efited from their ability to attract some of 
the world’s most talented students, schol- 
ars, and innovators, many of them from 
China. Now, as government officials con- 
front the immediate challenges posed by 
the Chinese leadership’s coercive actions at 
home and around the world, US research 
universities must prepare for an extended 
period of adversarial relations and poten- 
tial conflict between the United States and 
China. Can the values that underpin the 
excellence of US research universities survive
the struggle with China? What are the
nation’s goals for these universities? And 
what role should the universities them- 
selves play in shaping the course of aca- 
demic relations with China? 

Similar questions confront governments 
and universities elsewhere. Amid concerns 
about the risks to national security and 
academic freedom, UK universities have 
been warned about their “strategic depen- 
dency” on Chinese partnerships (1). The 
European Commission has recently pub- 
lished a “toolkit” to help universities miti- 
gate foreign influence in research and in- 
novation (2). In Japan, the government has 
introduced new rules that require security 
reviews before universities can accept for- 
eign students and researchers (3). The G7 
governments recently declared their inten- 
tion to work together to enhance research 
security without undermining academic 
freedom and open science (4). 

Yet governments should not lose sight of 
longer-term domestic and global interests 
in research and innovation. For example, 
senior Biden administration officials have 
recently underscored the importance of at- 
tracting talented Chinese students to the 
United States (5). In 2019, 41% of all science, 
technology, engineering, and mathemat- 
ics (STEM) PhDs graduating from US uni- 
versities were temporary visa holders, with 
China accounting for more of these gradu- 
ates than the next nine foreign countries 
combined (6). Most Chinese PhD gradu- 
ates stay in the United States, helping to 
advance US research and innovation (7). 
But rising United States-China tensions as 
well as US border control enforcement and 
well-publicized investigations into alleged 
campus intellectual property theft may be 
affecting Chinese graduate student applica- 
tions and enrollment in STEM programs at 
US schools. Although the overall impact of 
these factors is unclear, some United States- 
based faculty report that top-rated STEM 
students at Chinese universities, who in pre- 
vious years would have applied to graduate 
departments at leading American universi-
ties, are instead choosing to stay in China.

United States-China tensions are also
affecting cross-national collaborations be- 
tween US and Chinese researchers—in re- 
cent years by far the most important axis of 
international collaboration for US research- 
ers when measured by jointly authored pub- 
lications (8). Going forward, US researchers 
will presumably have even more to gain 
from such collaborations as China’s invest- 
ment in science and technology continues 
to grow rapidly (9). But some US faculty 
who have previously collaborated with col- 
leagues in China report that they are now 
holding back from joint research. 

Faculty and student concerns over rising 
bilateral tensions have been aggravated by 
a series of arrests and failed prosecutions 
of Chinese-origin university researchers 
accused of enabling scientific espionage. 
These actions have prompted accusations 
that the US government is criminalizing 
normal scientific and academic exchange 
and engaging in racial stereotyping (10).
Perceptions of bias and discrimination have 
helped convince many outstanding young 
Chinese scientists at US universities to pur-
sue their careers in other countries (11).

The strained relations between the US 
government and the US academic commu- 
nity add to the importance of self-initiated 
efforts by universities, reflective of their in- 
stitutional values. Some China-related chal- 
lenges are in any case better addressed by 
the academic community itself, and univer- 
sity actions—such as upgrading campus re- 
search security, identifying clearly the kinds 
of interactions with China that should be 
out of bounds, and establishing processes 
for deciding on difficult cases—can also 
build confidence among policy-makers and 
may help to avoid federal government over- 
reach in policies and regulations. 

To be clear, there is an urgent need for an 
integrated government policy framework 
for academic relations with China that ad- 
dresses immigration, research security, 
and research collaboration. But universi- 
ties should develop their own policies, pro- 
cesses, and risk management frameworks, 
informed by their deeper knowledge of 
educational and research practices and in- 
stitutional values and shaped by their role 
as guarantors of the intellectual autonomy 
of their faculty. 

Even as the economic and military ri- 
valry between the United States and China 
continues to build, US research universi- 
ties and the nation more broadly can ben- 
efit from academic exchange with China. 
Ending academic relations would weaken 
the foundations of US science, technology, 
and innovation and would harm US eco- 
nomic development and national security. 
LINES THAT WILL NOT BE CROSSED

In 2021, the authors of this article were 
asked by MIT President L. Rafael Reif to 
chart a path for MIT’s future relations with 
China. The resulting approach (12), now be- 
ing implemented, is designed to help MIT 
advance knowledge and the needs of the 
United States and the world—without dam- 
aging US interests in national security or 
the economy, without endangering human 
rights, and in ways consistent with the core 
values of our institution. 

In developing this approach, our group 
consulted extensively with experts in aca- 
demia and government both in the United 
States and internationally, as well as with 
many members of the MIT community. We 
assumed that the complex and challenging 
international security environment will per- 
sist and that relations between the 
United States and China may dete- 
riorate further. We recognized that, 
although continued academic rela- 
tions with China will bring benefits, 
engagement brings its own risks,
and that new approaches to manag- 
ing these risks are needed. 

We take seriously the concern
that the Chinese government, and
some other foreign governments,
are targeting US university research
and technology to gain advantage,
mostly through legal means but
sometimes illegally or improp-
erly. We also take seriously the ob-
stacles to academic collaborations
presented by Chinese government
policies that restrict academic au-
tonomy on Chinese university cam-
puses, increase the risk of seizure
of intellectual property deemed
to be in China’s national inter-
est, and attempt to exert influence
over Chinese students and scholars
in the West. We recognize too that
when researchers at US universities col-
laborate with individuals or institutions in
countries with authoritarian or autocratic
governments, the good intentions of their
direct collaborators are not enough to as-
sure good outcomes.

Our new strategy is the latest stage in
the ongoing development of policies for
China at MIT that began several years ago.
An important milestone occurred in 2019,
when a process was introduced for case-by-
case reviews of all proposed China-related
research, educational, and other formal
engagements, as well as engagements with
certain other countries, from the perspec-
tive of risks to national security, civil and
human rights, and economic competitive-
ness (13). In some cases, government regu-
lations prescribe how these risks should be
managed. For example, federal regulations
require principal investigators (PIs) sup-
ported by US government funding agencies
to demonstrate that their work for the gov-
ernment is adequately protected against
theft and to disclose international collabo-
rations that are related to that work. But
regulatory compliance is often not enough
to determine whether the proposed activi-
ties should be undertaken at all. MIT’s el-
evated risk review process provides guid-
ance on proposed activities that would not
violate federal rules but nevertheless re-
quire careful assessment of risks and bene-
fits to determine whether they should pro-
ceed. An important aspect of the process
is to consider the risks of not undertaking
proposed engagements as well as the risks
of doing so.

Lines that will not be crossed

The Institute and its faculty are called:

Not to host researchers and visiting students who
are known to be employed by Chinese military and
security institutions or who are graduates of China’s
civilian national defense universities

Not to enter into research collaborations with
China’s civilian national defense universities,
military research institutes, or national defense key
laboratories at civilian universities

Not to enter into relationships with Chinese corporate
or other entities that are known to provide systems,
products, or services with military applications to the
Chinese armed forces, or for which there is credible
evidence that their activities are contributing to the
suppression of human rights in China

Not to participate in Chinese talent recruitment
programs that are designed to transfer US
technology to China


The reviews involve faculty and admin- 
istrative committees, and the process is 
coordinated by the Office of the Provost 
(the Associate Provost for International 
Activities). The toughest cases are referred 
to a small group of senior administrators, 
and decisions are necessarily based on 
judgment rather than precedent or stan- 
dard rules. For research collaborations, 
decisions are made with the active partici- 
pation of the PIs, who typically have the 
best understanding of the benefits of the 
proposed collaboration for research as well 
as the technical capabilities of their part- 
ners. A key role of the PIs is to describe 
the benefits that all the participants in the 
proposed collaboration might expect to 
realize, as well as potential benefits to the 
nation and the world. 


But PIs are generally not as well informed
about national security and human rights 
risks. Information is also needed to under- 
stand the context in which the potential 
Chinese collaborators are operating, in- 
cluding the ways in which organizations 
and individuals in China are connected to 
the Chinese government or the Chinese 
Communist Party and the obligations they 
have to them. Again, PIs usually do not have 
this information and do not know where 
to get it. The process relies on inputs from 
country and regional experts at MIT and 
elsewhere; from MIT’s Washington, DC, of- 
fice; and occasionally from ad hoc faculty 
committees that may be convened for advice 
on difficult cases. As a result of this process, 
some proposed engagements have been re- 
jected, many have been approved, and for 
others, specific conditions have been
applied or modifications required,
in some cases to ensure reciprocity.

The new strategy goes beyond
these ex ante risk assessments and 
covers all aspects of MIT’s interac- 
tions with China, including research 
security on campus, informal col- 
laborations, the appointment of 
postdoctorates and visiting scien- 
tists, and executive and professional 
education. It provides practical 
guidance to the MIT community 
on these and other issues within a 
framework defined by the core mis- 
sion, goals, and values of MIT. The 
principal goals include ensuring 
that all members of the MIT com- 
munity, including those of Chinese 
origin, can thrive and do their best 
work without fear of external in- 
fluence, bias, or discrimination; 
enabling our faculty, staff, and stu- 
dents to work with leading Chinese 
researchers and institutions on
problems that are important to both 
countries and to the world; and educating 
our students about Chinese science, tech- 
nology, innovation, business, history, cul- 
ture, politics, and economics—knowledge 
whose benefits to our students, and more 
broadly to the United States, will only grow. 

The new strategy also describes lines 
that should not be crossed in MIT’s engage- 
ments with China (see the box). Other guid- 
ance for the MIT community covers tech- 
nology licensing, data protection, and travel 
to China. Regarding upgrades to campus 
research security, although PIs are gener- 
ally responsible for ensuring that all mem- 
bers of their research groups understand 
the norms and expectations concerning the 
sharing of information outside the group, 
the university should provide training and 
other guidance to help PIs with these tasks. 
MAPPING SPACES FOR COLLABORATION
Many academic leaders in the United States 
and Europe have expressed interest in our 
report, and some are developing and imple- 
menting approaches of their own. A challenge 
for all of them, even those with extensive in- 
house specialist knowledge, is how to acquire 
the information needed for risk assessments, 
especially concerning China’s policies, regu- 
lations, and practices in research, education, 
and innovation. There are opportunities 
for universities to work together to develop 
shared information resources. They may also 
soon be able to consult a new unclassified 
information service under development by 
the National Science Foundation that covers 
foreign research partners and projects that 
could pose security risks (14). 

By helping to map out spaces for pro- 
ductive exchange and collaboration, these 
measures by universities will assist their 
own faculty, who otherwise will be less and 
less inclined to pursue connections with 
China. Each institution needs to develop 
an approach adapted to its own culture 
and internal processes. But there is value 
in cooperation, and MIT is working closely 
with university associations such as the 
Association of American Universities and 
the Association of Public and Land-grant 
Universities on this issue. 

Although motivated by the particular 
context of China, many of these issues and 
approaches are also important for systemat- 
ically managing collaborations that involve 
a broader range of countries in what is an 
increasingly complex and dynamic interna- 
tional environment. Again, the likely ben- 
efits must be clearly identified and the risks 
managed effectively, and for most universi- 
ties, this will entail the development of new 
risk management capabilities. 

Alongside this risk management agenda, 
US research universities must also advo- 
cate effectively to preserve two key factors 
that have contributed to their global lead- 
ership but are now themselves at risk as 
United States-China tensions mount. First 
is the ability to continue admitting the best 
and most promising students in all disci- 
plines from around the world. When these 
people come to the United States to study 
and work, it strengthens US research and 
education immeasurably; investing in these 
students builds trust, friendships, and con- 
fidence in US research and innovation that 
can last a lifetime. Ensuring the greatest 
possible access for individuals of great abil- 
ity, regardless of nationality, is the strategy 
that will allow US universities to produce 
the highest benefits for the United States 
and for people everywhere. 

Second, the system of open scientific 
research that is the foundation of knowl-
edge, education, and innovation in US re-
search universities must be sustained and
strengthened. Restricting scientific dis- 
course stymies scientific progress by pre- 
venting researchers from building on and 
challenging each other’s work. Erecting
barriers around specific areas of academic 
research will deny the United States, as well 
as others, the benefits that result from sci- 
entific progress. Although special precau- 
tions may be needed in some new areas of 
research because of national security con- 
cerns, the United States has more to lose 
than to gain if sweeping restrictions on the 
conduct and publication of academic re- 
search are implemented. 

To that end, we urge federal agencies to 
avoid blurring the distinction between open 
and restricted research. For example,
they should exercise caution in expanding 
the reach of the Controlled Unclassified 
Information program (15), which estab-
lishes an intermediate category of govern-
ment-owned information that is unclassi- 
fied but subject to extra safeguards. Federal 
agencies should also give up the growing 
practice of requiring universities to apply
nationality or national origin criteria to de-
termine who should be permitted to work 
on their research projects. The government 
can and should vet which individuals are 
admitted to the United States, but once ad- 
mitted, they should be able to participate in 
any unclassified research project, except if 
participation would violate export controls 
(which restrict the transfer of certain sen- 
sitive technological information to foreign 
persons in the United States). Preventing 
certain members of university communities 
from working in specific research fields or 
from studying particular academic subjects 
because of their nationality is deeply prob- 
lematic and corrosive. 

In the current geopolitical environment, 
there is a risk of self-inflicted damage to the 
principles of openness, tolerance, and non- 
discrimination that differentiate the United 
States from its rival. In this difficult envi- 
ronment, US universities have an important 
role in articulating and defending the val- 
ues that have enabled them to flourish, in 
welcoming excellent Chinese students and 
scholars to their campuses, and in enabling 
their faculty and students to work safely 
with Chinese peers on shared intellectual 
challenges. Even if the overall trend in rela-
tions between the two countries is toward
less rather than more engagement, there 
are important areas of research and educa- 
tion in which the academic community, the 
nation, and the world would be better off 
with more rather than less United States- 
China scientific collaboration. 

US research universities should now work 
to establish a comprehensive, ongoing dialog 
with the federal government on the China is- 
sue. Universities should be proactive in ad- 
dressing new problems as they arise, while 
vigorously advocating for themselves as insti- 
tutions with values to uphold and with value 
to provide to the nation and the world. 
Gun gatekeepers and
their philosophies 


Individualism, conspiracism, and partisanship 
fuel American gun culture, argues a sociologist 


By Matthew Lacombe 


In Merchants of the Right, Jennifer Carlson
takes on a topic of crucial importance:
the relationship between conservative
gun culture and the core commitments of
American democracy. Along the way, she
sheds fascinating new light on the fac-
tors that galvanized the largest gun-buying 
spree in the country’s history in 2020 and 
shaped how many Americans responded to 
the tumult of that year. Balancing 
engaging prose with a sober tone, 
Carlson shows how American gun 
culture has developed over the 
past 50 years, articulates the “civic 
toolkit” it provides its adherents, 
and vividly describes how that 
toolkit was applied during a year 
marked by pandemic, protests, 
and contentious politics. 

Carlson relies on interviews 
with 50 gun sellers located in di- 
verse areas of the country. Con- 
ducted between April and August 
2020, these interviews provide in- 
sights into how gun sellers made 
sense of the year. Carlson’s use of gun sellers 
as a window into gun culture during the pan- 
demic is astute and clever; these individuals 
are not only enmeshed in gun culture, they in 
many ways act as its gatekeepers. 


The reviewer is at the Department of Political Science, Case
Western Reserve University, Cleveland, OH 44106, USA.
Merchants of the Right:
Gun Sellers and 
the Crisis of American 
Democracy 
Jennifer Carlson 
Princeton University 
Press, 2023. 288 pp. 


Carlson’s interviews focus on how adher- 
ents of the “dominant strains of gun rights” 
(a phrase used to clarify that not all gun 
owners hold this outlook) conceive of Ameri- 
can civic life and democracy—an approach 
she argues has come to be reflected in con- 
servative politics writ large. The book con- 
tends that this culture provides individuals 
with a civic toolkit consisting of three inter- 
related tenets, which inform their political 
outlooks generally and their reactions to the 
pandemic specifically. 

The first is armed individu- 
alism. Conceiving of guns as 
sources of personal protection 
and political empowerment, this 
tenet simultaneously views guns 
as helpful in the face of govern- 
ment underreach and as security 
against government overreach. 
Central democratic questions 
about collective responsibilities 
are boiled down to a belief that 
everyone is ultimately responsible 
for protecting themselves. Applied 
to the events of 2020, gun buying 
was a way to overcome the gov- 
ernment’s failure to provide physical safety 
and economic stability—to deal with the per- 
ceived chaos of the year—as well as protec- 
tion against attempts by the government to 
use the crisis to seize control. 

Potential buyers examine guns for sale at The Nation’s Gun Show in Chantilly, Virginia.

The second part of Carlson’s framework is
conspiracism. Animated by a populistic skep-
ticism of experts, this tenet emphasizes that
individuals need to come to their own con-
clusions in light of the misleading informa-
tion those in power are said to provide on be- 
half of their own agendas. Conspiracy beliefs 
provide a way to assert control by explaining 
otherwise intractable conditions—an asser- 
tion reinforced by gun ownership. We are all 
familiar with how this line of thinking mani- 
fested in 2020: Conspiracy theories about 
the pandemic, racial justice protests, and the 
election proliferated, as did gun buying—all 
in response to a combination of threats that 
left many Americans feeling anxious. 

The final tenet is an intense form of par- 
tisanship that encourages the denigration of 
opponents and even questions their demo- 
cratic worthiness. Gun ownership, as a po- 
litical identity, is attached to a wide range of 
conservative beliefs and (to an even greater 
extent) animosity toward liberals, who not 
only threaten gun rights but also are seen as 
irrational, irresponsible, and insufficiently 
independent. Liberal-coded measures—like 
mask wearing—should be rejected, faith that 
Democrats play fair in elections should be 
low, and, in the most extreme case, insurrec- 
tion may be acceptable. 

On the topic of partisanship, Carlson 
also explores how gun sellers’ civic toolkit 
shaped their reactions to 2020’s first-time 
gun buyers, who disproportionately came 
from sociopolitical demographics outside of 
the conservative, white, male center of gun 
culture. Were gun sellers—the culture’s gate- 
keepers—accepting of these new, often liberal 
gun owners? The answer, by and large, is no; 
gun sellers viewed many liberal gun owners 
as “unfit for Second Amendment rights,” hyp- 
ocritical, and in need of political reeducation. 

“Gun sellers enlisted partisanship...to 
parse out who could claim the identity of 
gun owner and all that it implied politically, 
socially, even morally,” writes Carlson. New
gun owners would only be accepted into the 
community if they adhered to all three of its 
core tenets, thereby putting sharp boundar- 
ies around membership in it—boundaries 
that are paradoxically harmful to gun sellers 
materially and, at least in some ways, harm- 
ful to the gun rights cause politically. 

Through an examination of conservative 
gun culture, Carlson helps explain the rise, 
spread, and consequences of armed right- 
wing populism and the anti-system conspir- 
acy beliefs and extreme partisanship associ- 
ated with it. She shows not how gun culture 
has been mobilized on behalf of gun rights, 
but instead how its adherents view and prac- 
tice democracy—and how those views and 
practices have influenced conservative poli- 
tics more broadly. 

10.1126/science.adh2084 
The era of the intoxicated
experimentalist


A historian revisits psychoactive 
self-experimentation’s 19th-century heyday 


By Lucas Richert 


In 1912, the amateur scientist and medi-
cal historian Victor Robinson analyzed 
the effects of hashish on his friends, 
colleagues, and himself and concluded 
that cannabis held value as a medi- 
cal tool and was generally safe. But he 
also believed that developing rock-solid 
evidence around potential drugs and medi- 
cines was a fraught business. “Old drugs, 
like old folks,” he proclaimed in “An Essay
on Hasheesh,” published in Medical Review 
of Reviews, “must give way to the new, and 
even the therapeutic master-builders must 
beware when the young generation of heal- 
ing-agents knocks on the door of health.” 
With Psychonauts, author Mike Jay, who 
is among our finest big-picture analysts and 
popular historians of global intoxicants, 
colorfully contextualizes how individu-
als like Robinson self-experimented with
potential “healing agents.” Over 300-plus 
pages, Jay offers the reader an occasion- 
ally uncomfortable bird’s-eye view of what 
it took to self-assess the risks, efficacy, and 
mind-expanding properties of all manner 
of substances—and how “psychonauts” 
like Robinson navigated the roiling seas of 
pharmacological uncertainty, commercial 
imperatives, and the pitfalls of addiction. 
The word “psychonaut” harkens to the 
1950s and refers to explorations of the mind 
and altered states of being, deriving from the 
roots of “psyche” (spirit or mind) and “nautes” 
(sailor); yet Jay accurately points to the much 
longer tradition of self-experimentation. His 
analysis, set largely in the 19th century in 
Euro-America, appropriately carries forward 
into the 20th, terminating in the mid-1970s. 
In this way, Jay connects Alexander “Sasha” 
Shulgin with Sigmund Freud, Humphry 
Davy, Thomas De Quincey, William James, 
and other renowned psychonauts, providing 
a throughline from the past to the present. 
Jay is on familiar and comfortable ter- 
ritory with Psychonauts. In earlier books, 
Emperors of Dreams and High Society, Jay 
exposed readers to the entanglements of
drugs, science, and culture during the 19th 
century. More recently, Mescaline: A Global 
History of the First Psychedelic charted pey- 
ote’s journey across the planet as an object 
of ritual, religion, and biomedicine. Jay’s 
Psychonauts similarly blends social and bio- 
medical history; the result is a wide-ranging 
and lavishly illustrated account of multiple 
drugs and self-experimentation that lacks 
the paint-by-numbers trappings of many 
academic monographs. 


Figuier, depicts a vivid ether-induced hallucination. 


The scope of scientific practice is a ma- 
jor feature of Psychonauts. How is the sci- 
ence of drugs conducted? Who is doing it? 
What counts as expertise? To address these 
questions, over the course of eight chap- 
ters, Jay identifies how individuals from 
all walks of life tested the effects of various 
mind-altering agents on themselves, and 
he explores why the methodology of self- 
experimentation (turning the researcher 
into both observer and subject) ultimately 
matters in the history of science. 

Psychonauts:
Drugs and the Making of
the Modern Mind
Mike Jay
Yale University Press, 2023.
376 pp.

Objective science, we learn, was a prod-
uct of professionalization and nascent tech-
niques in the mid-1800s, and this so-called
progress recalibrated expectations about
evidence around intoxicants and medica-
tions. As Jay recounts the variety of individu-
als carrying out drug testing, he transitions
expertly from major figures to “their less fa- 
mous contemporaries,” meaning his protago- 
nists come from disparate realms of basic
science, medicine, and pharmacy, in addition to
literature, history, and philosophy. It also means
that the reader is exposed to previously un- 
derexplored historical actors beyond Freud 
or James. Here, one of the book’s few mis- 
steps rests in Jay’s early vow to share how 
women and persons of color were present 
in this history. This promise—save for some 
scarce instances—goes mostly unfulfilled. 

Psychonauts, more generally, arrives at an 
auspicious moment in the history of intoxi- 
cants and drugs, a time when far-reaching 
regulatory changes are rattling the founda- 
tions of global drug policy. With the psy- 
chedelic class of drugs, for example, recent 
years have witnessed an uptick in venture 
capital investment, increased patent activ- 
ity and interest from the US Food and Drug 
Administration and the National Institutes 
of Health, as well as sustained research 
activities in the biomedical establishment. 
While gesturing toward this renaissance, 
Jay’s primary interest is not to advance spe- 
cific recommendations about psychedelic 
science in the present; rather, he sketches 
historical parallels around scientific experi- 
mentation with altered consciousness that 
enhance modern discussions. 

Beyond its fascination with the psyche- 
delic community and those steeped in drug 
and pharmaceutical history, Psychonauts has 
an even wider applicability. It furnishes in- 
sights about replication debates in scientific 
circles, particularly the contemporary repro- 
ducibility crisis. It also casts a bright light 
on the contentious evolution of our mental 
health crisis and the enduring struggle to 
develop efficacious psychopharmacological 
interventions. Lastly, Psychonauts lends crit- 
ical perspectives into how citizen scientists 
and activists operating on the margins of 
society or outside the mainstream biomedi- 
cal establishment may produce beneficial 
knowledge. To be sure, while Psychonauts 
centers on “old drugs,” “the therapeutic mas- 
ter-builders” of the 19th and 20th centuries, 
and a “young generation of healing-agents,” 
one of its biggest strengths is that it expands 
our minds in other critical ways.
Biological pest control
protects pollinators 


Since 2016, the fall armyworm (Spodoptera
frugiperda) has invaded extensive areas
in the Global South (1). Native to the
Neotropics, this migratory noctuid moth 
annually colonizes corn fields, where its 
larvae voraciously feed upon vegetative and 
reproductive tissues. In Africa alone, S. fru- 
giperda inflicts US$9.4 billion of damage 
per year (2). As a result, many small farms 
have increased their use of insecticides to 
mitigate pest-induced losses (1). Although
these chemicals protect corn crops in the 
short term, they also put pollinators at 
risk. Biological pest management would 
protect pollinators and, in turn, long-term 
food security. 

For wild and domesticated bee pollina- 
tors, corn pollen and liquids excreted from 
leaves constitute a substantial, although 
temporally restricted, resource even 
in diverse farming landscapes (3). Bee 
colonies rely extensively on these dietary 
resources during periods of pollen deficien- 
cies in simplified agroecosystems or under 
extended drought, when floral nectar is 
scarce (4, 5). In tropical settings with con- 
tinuous corn cropping, corn pollen features 
prominently among honeybee foraging 
resources year-round (6). 

Insecticide use has been shown to com- 
promise bee-mediated pollination in tem- 
perate regions (7). Given pollinators’ exten- 
sive exposure to corn pollen in the tropics 
and subtropics, insecticide use on corn 
crops will likely have the same effect. Bee 
colony health is particularly vulnerable in 
settings with insufficient sources of pollen
or nectar, which can be exacerbated by
climate change (8). Considering that bees 
sustain human food security and nutrition 
(9), the cascading One Health impacts of 
insecticide-centered S. frugiperda control 
cannot be overlooked. 

Supplanting insecticides with agro- 
ecological and biodiversity-based alter- 
natives can either prevent or control S. 
frugiperda outbreaks (10, 11). Increased
usage of manure or compost, cropland 
diversification, conservation of natural 
habitat patches, and resistant maize 
cultivars would counteract pest build- 
up. Meanwhile, microbial biopesticides, 
botanicals, and scheduled releases of 
micro-wasps or insect-killing nematodes 
can provide curative control (1). 

Biodiversity-based strategies also 
strengthen natural pest regulation. Given 
its protein, carbohydrate, and micronutri- 
ent content, corn pollen is a coveted food 
resource for myriad beneficial arthropods 
(12). In the case of short-lived, annual crops,
such as maize, pollen feeding allows car- 
nivorous ladybirds, ants, parasitic wasps, 
and ground-foraging sheet-web spiders 
to sustain their populations and provide 
(cost-free) biological control services. Many 
of these organisms provide effective S. 
frugiperda biological control in its native
habitats in the Americas and are slowly but
steadily adapting to this pest in its invasive
range (1). Replacing chemical insecticides
with biodiversity-driven alternatives would 
uphold societal wellbeing in some of the 
world’s most food-insecure regions. 
Effectively implementing
biosecurity policies 


In her Science Insider story “U.S. scientists 
brace for greater scrutiny of risky research” 
(3 February, p. 422), J. Kaiser explains that 
recently issued recommendations (1) from
the National Science Advisory Board for 
Biosecurity (NSABB) would require govern- 
ment oversight for all pathogen research. 
Although we strongly support the NSABB 
recommendations, we recognize that the 
disjointed federal oversight of biosafety 
and insufficient support could stymie their 
implementation. To operationalize effective 
biosecurity oversight systems, the US gov- 
ernment must provide regulators and local 
institutional staff with the appropriate tools 
and resources. 

The existing biosecurity framework 
includes multiple agencies, including the 
Department of Health and Human Services 
(DHHS), that have unique authorities and 
responsibilities. For the NSABB recom- 
mendations to work, the DHHS will need 
to unify oversight under a single agency, 
establish an intra-agency biosecurity gover- 
nance framework, or consider an alternative 
biosecurity oversight mechanism. Any new
US government biosecurity policies must
also account for differences in dual-use
procedures within various settings, such
as universities, pharmaceutical compa-
nies, public health laboratories, and other
organizations that use biotechnology. The 
policy must address the impact on workload 
and training for staff including biosafety 
professionals, compliance personnel, and 
researchers. Because NSABB is recommend- 
ing expanded oversight for a broader range 
of experiments, biosecurity reviews will need 
to be integrated into life sciences research, 
including privately funded research, on an 
ongoing basis. 

Such changes will require federal and 
institutional investment. The Federal Select 
Agent Program (2), dual-use (3), and pan- 
demic pathogen (4) policies are unfunded 
mandates. To fully assess the fiscal impact of 
these recommendations, a baseline and peri- 
odic financial evaluation needs to be con- 
ducted before policies change. Supporting 
biosafety and biosecurity efforts could 
include an increased cost ceiling on federal 
and private grants and dedicated funding 
at institutions. Any future policy changes 
should allocate funding for personnel, infra- 
structure, and compliance operations. In 
addition, the agencies implementing these 
recommendations must be funded to cover 
the costs of training, educational resources, 
communication, audits, and enforcement. 

Moving forward, the NSABB should 
include more members that have practical 
experience running institutional biosecurity 
oversight programs. Members should repre- 
sent a variety of sectors to ensure a broader 
perspective in how these policies affect local 
research enterprises. Before implementation, 
there should be a thoughtful and thorough 
review of their impact on a range of stake- 
holders. As with the development of these 
NSABB recommendations, every step of the 
decision-making process should be trans- 
parent to the public and concerned citizens 
should have a voice in the outcome. 
China bans electric
capture of earthworms 


The earthworm is widely distributed, with 
4000 species in the world and more than 
300 species in China (1, 2), where they are
in demand as fishing bait, livestock feed, 
and components of traditional medicine 
(3). In recent years, poachers in China 
have started using soil electrocution to 
capture earthworms, putting earthworm 
populations and ecosystems at risk. In 
February, China prohibited the practice. 
The decision is a necessary first step to 
protect agriculture, but to ensure that 
electric capture ceases, China must follow 
up with legislation to support the imple- 
mentation of the ban. 

Known as “ecosystem engineers,” earth- 
worms play a vital role in the biogeochem- 
ical cycle (4). They can affect the biologi- 
cal, chemical, and physical processes of 
the ecosystem through feeding, digestion, 
excretion, and burrowing. The earthworm 
population, which is closely related to 
soil microorganisms, soil animals, and 
soil enzyme activities, can increase soil 
organic matter, improve soil structure, 
and increase soil fertility (5). 

In the past, earthworms in China were 
captured manually, which limited the 
number that could be caught in a short 
amount of time and did not threaten 
the earthworm population or the soil 
ecosystem. In contrast, applying electric 
current to the soil can catch about 150 kg 
of earthworms in a day (6, 7). Removing 
earthworms at this scale threatens the 
species with local extinction and robs the 
targeted ecosystem of the benefits they 
provide (8). 

This capture method threatens other spe- 
cies as well. The electric current may kill or 
harm soil organisms beyond earthworms. 
The changes could disrupt the food cycle
for birds, arthropods, and mammals that
depend on earthworms as a food source (9,
10). The sharp decrease of earthworms in the
soil may also affect soil fertility and produc-
tivity, reducing crop yields (11).

Electric earthworm capture has quickly 
become widespread. Because earthworms 
are not included on the list of protected 
wild animals in China, there have been 
few legal ramifications for using the 
approach, despite the damage it causes. In 
February, the Chinese government took a 
step toward better agricultural steward- 
ship when it unveiled its agriculture plan 
(12), which explicitly prohibits electric 
capture of earthworms, poaching of pha- 
eozem, and other activities that damage 
the soil ecosystem. However, the financial 
benefits of large-scale earthworm poaching 
may provide poachers with an incentive to 
continue using electric capture despite the 
ban. To implement the ban effectively, China 
should pass legislation that specifies how 
poachers will be penalized if they continue 
to use soil electrocution. In addition, law 
enforcement agencies should be authorized 
to take action if they find poachers disre- 
garding the policy. 
SLEEP


Sleeping deep


Sleep is essential, but not all mammals live in environments where long periods of time
asleep are possible. Marine mammals encounter especially challenging conditions for
sleep when they are at sea. Using advanced remote monitoring techniques, Kendall-Bar
et al. found that wild northern elephant seals can sleep for less than 2 hours per day at sea
and do so while diving to depths of around 300 meters. Unlike other marine mammals,
they enter full REM sleep, with accompanying paralysis, but they do so at depths below those
occupied by their predators. —SNV

Science, adf0566, this issue p. 260


Although northern elephant seals do sleep on land, like the one pictured here in California, they can also sleep
while diving to 300 meters underwater.


PROTEIN DESIGN
Complex architectures by
top-down design


Designing artificial protein com-
plexes based on components
with established structural
properties can impose limits
on the properties of the final
complex. Lutz et al. developed
methods for generating com-
plex protein architectures that
adhere to preordained param-
eters using reinforcement
learning. They demonstrated
this approach by generating
designs that fill arbitrary vol-
umes, including a symmetrical
connector between previously
designed protein rings. A small
protein designed to assemble
into 60-subunit icosahedra
may be useful for presenting
antigens in vaccines or signal-
ing molecules in multivalent
agonist complexes, as the
authors demonstrate in
preliminary biological experi-
ments. —MAF

Science, adf6591, this issue p. 266


A retrotransposon zooms
in on its target
Much of the human genome 
consists of repetitive 
sequences called transpo- 
sons that can copy and paste 
themselves into new DNA 
locations. The most common, 
and only demonstrably active, 
type of transposon in humans 
is the non-long terminal repeat 
retrotransposon, which uses 
an RNA intermediate that is 
converted into DNA directly 
at the site of a new insertion. 
Wilkinson et al. used
cryo-electron microscopy to show
what determines the sequence 
that is inserted and where it 
is inserted. They also show 
that the specificity can be 
altered, suggesting that these 
retrotransposons could be 
engineered as gene insertion 
tools. —DJ 

Science, adg7883, this issue p. 301 


CANCER EVOLUTION 


A tale of two cancers 
Already threatened by habitat 
loss and past persecution, 
Tasmanian devil populations 
have been decimated over the 
past few decades by a fatal 
transmissible cancer referred 
to as DFT1. Some recent 
research has suggested that 
the growth and spread of this 
cancer has slowed, and it may 
have become endemic, leading 
to lower mortality. However, 
Stammnitz et al. describe the
evolution of a second transmis- 
sible facial cancer in Tasmanian 
devils, DFT2. This lineage 
evolved more recently, mutates 
more rapidly, and is a high-level 
threat to this species. —SNV 
Science, abq6453, this issue p. 283 


SEXUAL SELECTION 
A passing advantage 


Female choice plays a large
role in shaping populations.
Across many species, females 
have been shown to prefer 
males with traits that are rare or 
uncommon. How this prefer- 
ence is maintained over time 
has remained an open question. 
Potter et al. looked across
generations in Trinidadian gup- 
pies and found that females do 
have a clear preference for rare 
males, and that they acquire a 
further fitness benefit through 
sons that also have these rare 
traits. Once rare traits become 
more common, however, this 
fitness benefit dissipates such 
that rare traits in the father 
eventually become common, 
leaving the grandsons to be less 
preferred. —SNV 

Science, ade5671, this issue p. 309 


Twist and confinement 
Optical fibers form the backbone 
of the modern information age, 
with information encoded in 
various properties of light: wave- 
length, polarization, intensity, 
and phase. These modes rely on 
total internal reflection within the 
fiber. To further increase capac- 
ity, propagating light in different 
spatial modes within the fiber 
is possible, but this is compli- 
cated by mode mixing, in which 
the information is corrupted. 
Using light with orbital angular 
momentum, Ma et al. demon-
strate an optical fiber designed
with a topological feature that 
confines the light separate from 
that of total internal reflection, 
thus avoiding the detrimental 
issues related to mode mixing. 
These results suggest a route 
for developing higher-capacity 
optical networks. —ISO 

Science, add1874, this issue p. 278 


Representative modes in an optical 
fiber illustrated as spiral patterns 
The secret of ancient
Maya masonry 


The addition of organic 
materials in the making of 
lime plasters is key to achiev-
ing the remarkable stability
of pre-Columbian mortars
and stuccoes despite expo-
sure to a hot and humid
tropical environment. Studying 
Maya plasters from Copan, 
Honduras, Rodriguez-Navarro 
et al. found organic materials 
and nanostructures of calcite 
biominerals analogous to 
those found in seashells. By 
testing lime plaster replicas in 
which bark extracts from local 
trees in Copan were added, 
the authors showed that the 
presence of these organic addi- 
tives enhanced the toughness 
and weather resistance of the 
plasters. Optimized lime- 
based plasters may be useful 
for heritage conservation and 
modern sustainable construc- 
tion. —JKD 


Sci. Adv. (2023) 
10.1126/sciadv.adf6138 


Fighting flu 
A major goal for influenza 
vaccines is to elicit broadly 
reactive immune responses 
that can protect against many 
strains of the virus. Two stud- 
ies report results of a phase 1 
clinical trial testing a vaccine 
that may get closer to this goal. 
Widge et al. demonstrated that 
immunization with one or two 
doses of an H1
hemagglutinin-stabilized stem nanoparticle
(H1ssF) vaccine was safe in 
recipients and elicited dura- 
ble neutralizing antibody 
responses. Andrews et al. found 
that memory B cell responses 
elicited by H1ssF vaccination 
were broadly cross-reactive 
and targeted two conserved 
epitopes on the H1 stem. These 
studies highlight the potential 
of H1ssF and similar stem-only 
immunogens as influenza vac- 
cines. —CSM 
Sci. Transl. Med. (2023) 
10.1126/scitranslmed.ade4790, 
10.1126/scitranslmed.ade4976
Stabilizing COFs with NO


The assembly of covalent 
organic frameworks (COFs) 
relies on reversible covalent 
bond formation to heal any 
structural defects, but such 
bonds could also lead to low 
stability in sorption applications, 
especially if reactive gases such 
as nitric oxide (NO) are being 
adsorbed. Emmerling et al. 
found that NO exposure actually 
passivated unreacted terminal 
amines in COFs through denitro- 
genation. Whereas imide and 
thiazole linkages were largely 
unreactive, NO caused conver- 
sion of imine and amine linkages 
that changed the COF surface 
area and crystallinity. The 
amine linkages could be con- 
verted to N-diazeniumdiolates 
(“NONOate”) linkages at room 
temperature. These modified 
COFs released NO under
physiological solution condi-
tions and offer a platform for
the delivery of NO in vivo as a
signaling molecule. —PDS
J. Am. Chem. Soc. (2023)
10.1021/jacs.2c11967


Edited by Caroline Ash
and Jesse Smith


MATERIALS SCIENCE
Improving water 
treatment membranes 


Converting brackish or waste 
water to clean drinking water 
requires semipermeable 
reverse osmosis membranes 
that permit water flow while 
retaining contaminants. Despite 
research that claims to improve 
on existing membrane materials, 
the widely used gold standard 
reverse osmosis membrane has 
not changed in decades. In a
Perspective, McCutcheon and 
Mauter discuss the problems 
with translating the development 
of new materials to commercial 
application. In particular, they 
highlight the need to contextual- 
ize materials development within 
the entire water treatment pro- 
cess, test new materials under 
relevant conditions, and assess 
manufacturability. Changing 
evaluations in this way could 
improve the application of mate- 
rials science to the treatment of 
contaminated water. -GKA 
Science, ade5313, this issue p. 242 


SUSTAINABILITY 
Facing coupled 
environmental crises 


Humanity is facing major social 
and ecological impacts from 
climate change and biodiver- 
sity loss. These two crises are 
intertwined, with common 
causes and effects on one 
another. Pörtner et al. review
the results of a joint meeting of 
members of the International 
Panels on Climate Change and 
Biodiversity and Ecosystem 
Services. They discuss the con- 
nections between biodiversity 
loss and climate change and 
propose potential solutions for 
addressing them as intercon- 
nected problems. Drastic 
reductions in greenhouse gas 
emissions, protection of multi- 
use landscapes and seascapes, 
and policies for providing 
equitable access to natural 
resources can help to ensure 
future ecological function and
human well-being. —BEL 
Science, abl4881, this issue p. 256 


GENOME EDITING 
Base editing in a single 
treatment 


Spinal muscular atrophy is the 
leading genetic cause of infant 
death. It arises from the lack of 
a protein called survival motor 
neuron (SMN). Drugs that 
increase SMN are effective but 
require repeated dosing or may 
fade over time. Arbab et al. iden- 
tified genome-editing strategies 
that permanently correct SMN 
protein levels to normal levels 
by converting a partially active 
gene encoding SMN into a fully 
active form. In a mouse model, 
treatment with base editors that 
efficiently and precisely make 
this change increased life span 
and rescued motor function. A 
one-time combination treatment 
of a base editor and a current 
spinal muscular atrophy drug 
further improved outcomes in 
mice. —DJ 

Science, adg6518, this issue p. 257 


CANCER 
RNA surveillance turns 
oncogenic 


Precise RNA regulation is a qual- 
ity control mechanism that is 
important for normal develop- 
ment and to prevent disease. 
Aberrant RNAs require identifi- 
cation and destruction to avoid 
translation of defective proteins. 
Insco et al. report that cyclin- 
dependent kinase 13 (CDK13), 
which activates an RNA surveil- 
lance mechanism to degrade 
abnormal RNAs, also has a 
tumor-suppressor function (see 
the Perspective by Fisher). When 
CDK13 was mutated in an animal 
melanoma model, accumulation 
and translation of aberrant RNAs 
resulted in more aggressive 
malignancy. Analysis of other 
cancer types revealed similar 
CDK13 mutations and showed 
that additional RNA surveillance
genes were recurrently mutated 
in human tumors. These findings 
suggest that RNA surveillance 
may have a previously unrecog- 
nized tumor-suppressive role. 
—PNK and YN 

Science, abn7625, this issue p. 258;
see also adh4051, p. 240


METABOLISM 
Linking AMPK to organelle 
biogenesis 
The kinase AMPK is a key sen- 
sor that helps to control energy 
homeostasis. Malik et al. reveal 
the mechanism by which AMPK 
controls the transcription 
factor TFEB to increase gene 
transcription and to support 
mitochondrial and lysosomal 
biogenesis. AMPK appears 
to act by direct phosphoryla- 
tion of folliculin-interacting 
protein 1 (FNIP1). FNIP is part 
of a complex that acts as a
GTPase-activating protein for the
GTPases RagC and RagD, which 
regulate the mechanistic target 
of rapamycin complex 1 protein 
kinase signaling complex on the 
lysosomal surface. This results 
in release of TFEB from the lyso- 
some, allowing it to act at the 
nucleus. —LBR 

Science, abj5559, this issue p. 259 


QUANTUM MECHANICS 
Schrödinger’s cats,
kittens, and lions 


The idea of Schrödinger’s cat
being both alive and dead at the
same time—its fate revealed
only upon inspection—came
from a thought experiment
that pointed out an absurdity in
the interpretation of quantum 
mechanics at the time. However, 
because such superposition 
states have now been prepared 
in many different quantum 
systems, the question is where 
do the classical and quantum 
worlds part company? Bild
et al. prepared, observed,
and controlled cat states of
a 16-microgram mechanical
resonator. Being able to control 
the size of the superposition 
states, they effectively created 
a menagerie of quantum states, 
thus providing a platform to 
explore the boundary between 
the quantum and classical 
behavior. —ISO 

Science, adf7553, this issue p. 274 


NEUROEVOLUTION 
View of an ancient brain 


The evolutionary origin of 
nervous systems remains 
a fundamental question in 
biology. A hallmark of ner- 
vous systems is that they are 
composed of discrete cells 
(neurons) that communicate 
through synapses. Ctenophores, 
a sister group to all animals with 
nervous systems, play a key role 
in comparative studies into the 
evolutionary origin(s) of neu- 
rons and their connections. To 
establish neuronal circuits that 
facilitate ctenophore behavior, 
Burkhardt et al. used high- 
resolution three-dimensional 
electron microscopy, revealing 
that nerve-net neurons are not 
separate entities, but rather 
are interconnected through 
continuous neurite plasma 
membranes without evidence of 
synapses (see the Perspective 
by Dunn). The findings offer a 
new perspective on the evolu- 
tion of neuronal networks and 
neurotransmission. —MMa 
Science, ade5645, this issue p. 293; 
see also adh0542, p. 241


CONSERVATION 


Private land protection 
To protect high rates of biodi-
versity, a large amount of global
land must be under some sort
of protection. In regions where
public land is not prioritized nor
widely distributed, it is pos-
sible that protection of private
land could contribute to spe-
cies conservation. In Brazil, a
native vegetation law instituted 
decades ago has provided an 
opportunity for evaluation of the 
role of private land in conserving
species. De Marco et al. looked 
at mammal species protected 
by these private set-asides in the 
Brazilian Cerrado and found that 
they covered up to 25% of spe- 
cies ranges (see the Perspective 
by Machado and Aguiar). Such 
areas play an even more impor- 
tant role when ecologically intact 
or restored. —SNV 

Science, abq7768, this issue p. 298;
see also adh1840, p. 238


IMMUNOLOGY 
The master of MAIT 
cell metabolism 


Mucosal-associated invariant 
T (MAIT) cells are a type of 
innate-like T cells that recognize 
bacterial metabolites. Upon 
activation, they proliferate and 
produce cytokines to promote 
host defense. Kedia-Mehta et al. 
show that MAIT cell proliferation 
depends on MYC-associated 
pathways involving amino 
acid transport and glycolysis. 
Furthermore, MAIT cells from 
patients with obesity show 
disrupted function and engage- 
ment of these pathways. These 
metabolic pathways may be 
relevant to the development 
of MAIT cell-based therapies. 
—AEB 
Sci. Signal. (2023) 
10.1126/scisignal.abo2709 


NEUROIMMUNOLOGY 
Dispensable dural 
lymphatics 

Lymphatic vessels in the dura, 
the outermost layer of the 
meninges, provide a vascular 
path for immune cells con- 
necting the meninges with the 
systemic circulation. Dural 
lymphatics have been pro- 
posed as a gateway that T cells 
targeting CNS autoantigens 
use to access the brain and 
spinal cord. The formation and 
maintenance of dural lymphat- 
ics can be abrogated by genetic 
or pharmacologic interference
with vascular endothelial growth
factor C (VEGF-C) or its recep-
tor, VEGFR3. Li et al. found that
atrophy of dural lymphatics by 
VEGFR3 blockade in mice was 
insufficient to block autoimmune 
neuroinflammation initiated 
by immunization with myelin 
autoantigens or transfer of 
encephalitogenic T cells. These 
findings suggest that therapies 
aimed at disrupting dural lym- 
phatics are unlikely to attenuate 
human autoimmune neuroin- 
flammatory diseases such as 
multiple sclerosis. —IRW 
Sci. Immunol. (2023) 
10.1126/sciimmunol.abq0375 


The hormonal X factor 
Postmenopausal women com- 
prise the majority of Alzheimer’s 
disease patients (70%). This 
population has been shown to 
have increased toxic neuronal tau 
protein deposition compared with 
age-matched men. However, how 
sex, age, and hormone replace- 
ment therapy (HRT) affect 

tau deposition in cognitively 
unimpaired individuals remains 
unclear. Using tau imaging in 
cognitively normal postmeno- 
pausal women and age-matched 
men, Coughlan et a/. showed that 
women had higher tau deposi- 
tion in several brain areas. Early 
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there is a need for methods 
to efficiently recover valuable 
metals from lithium-ion bat- 
teries once they reach their 
end of useful life. Quintana et 
al. developed a liquid extrac- 
tion process that uses rotating 
reactors to stir, separate, and 
capture metal ions. Using 
either a vertical or horizontal 
configuration, the reactors 
contain a higher pH aqueous 
feed and a lower pH aque- 
ous acceptor phase that are 
kept separate by an inorganic 
solvent. The organic layer 
contains low concentrations 
of extractants that shuttle 
manganese and cobalt from 
the feed, and because they 
will cycle multiple times, 
much less of these elements 
is needed compared with 
conventional extraction tech- 
niques. —MSL 


PLANT BREEDING 


Ethiopian farmers 
inform wheat breeding 


ocal farming practices are 
best understood by the farm- 
ers themselves. Gesesse et al. 
combined Ethiopian farmers’ 
knowledge with genetic data 
on durum wheat to assist breed- 
ing. Smallholder farmers from 
three sites were skilled at assess- 
ing and differentiating among 
durum wheat varieties. Farmer 
preferences varied in relation to 
yield, morphology, and phenology 
depending on locality. Preference 
scores were used in genomic 
selection and association analyses 
to identify alleles underlying 
performance. Some loci colocated 
with yield, but others were inde- 
pendent, indicating that they may 
relate to alternative traits such as 
those relating to potential for local 


Aav. Mater. (2023) 
10.1002/adma.202211946 


menopause and late initiation 

of HRT were associated with 
increased neuronal tau compared 
with late menopause and early 
HRT. —MMa 


JAMA Neurol. (2023) 
10.1001/jamaneurol.2023.0455 


IMMUNE ADAPTATION 
Old selection affects 
modern health 


The immune system is a 
constant and repeated target 
of natural selection. Kerner et 
al. used a wide array of ancient 
and modern European DNA to 
assess Selection on immune- 
related alleles in humans. 

The authors modeled allele 
frequency trajectories for over 
1 million variants, finding 89 
independent loci predicted to be 
under positive selection since 
the Neolithic period. Many of 
these variants were associated 
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with hematopoietic traits such 
as platelet and reticulocyte 
counts, autoimmune disease, 
and infection. This finding 
indicates that these traits have 
been directly affected by selec- 
tion in Europeans and supports 
the idea that increased protec- 
tion from infectious disease 
may concomitantly increase the 
risk for autoimmunity. —CNS 
Cell Genom. (2023) 
10.1016/j.xgen.2022.100248 


MICROBIOTA 


Not all fiber is the same 
A healthy diet, we are told, 
demands a high fiber intake 

to sustain a healthy micro- 
biota. However, fiber is a 
heterogenous substance, and 
its composition is likely to 
influence the range of microor- 
ganisms that ferment it. Using 
fecal batch cultures, Solvang et 
al. tested microbial growth ona 


adaptation. -MRS 


Proc. Natl. Acad. Sci. U. S.A. (2023) 
10.1073/pnas.2205774119 


Detail of a wheat field in the highlands 
near Lalibela, Amhara Region, Ethiopia 


variety of edible plant extracts. 
At 72 hours, fermentation 
products, bacterial abundance, 
and community composition 
were measured. More complex 
substrates supported a more 
variable community, which also 
varied by fiber source. For exam- 
ple, high arabinan from beetroot 
and high galactan from carrots 
predicted specific microbial 
enrichment. Thus, knowledge of 
dietary fiber composition will be 
valuable in designing diets that 
optimize desirable composi- 
tion and function in a patient's 
microbiota. —CA 

Environ. Microbiol. (2023) 

10.1111/1462-2920.16368 


MATERIALS SCIENCE 
Recovering metals from 
lithium-ion batteries 


With the growth in electric 
vehicles and grid storage, 
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SCIENCE EDUCATION 
An arbitrary gatekeeper 


Another blow is delivered against 
the predictive power of the 
Graduate Record Examination 
(GRE). Feldon et al. performed 
a meta-analysis to assess the 
utility of the GRE as a predictor 
across outcome variables for 
graduate students in the United 
States. The team also assessed 
changes in observed effects over 
time as related to increasing 
diversity in the graduate student 
population. Their results showed 
that 61.6% of the reported 
effects showed no predictive 
value of GRE scores on student 
outcomes. Sample composi- 
tion effects by race or ethnicity 
were notable but nonsignificant, 
with increasing proportions of 
people of color within a study 
sample associated with poorer 
predictive validity, a result that 
should be of great interest to 
institutions working to increase 
diversity. This study provides 
additional evidence that remov- 
ing GRE requirements for 
graduate applications eliminates 
a source of unnecessary ineq- 
uity. -MMc 


J. High. Educ. (2023) 
10.1080/00221546.2023.2187177 
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SUSTAINABILITY 


Overcoming the coupled climate and biodiversity 
crises and their societal impacts 


H.-O. Pörtner*, R. J. Scholes†, A. Arneth, D. K. A. Barnes, M. T. Burrows, S. E. Diamond, C. M. Duarte, 
W. Kiessling, P. Leadley, S. Managi, P. McElwee, G. Midgley, H. T. Ngo, D. Obura, U. Pascual, 
M. Sankaran, Y. J. Shin, A. L. Val 


BACKGROUND: Two intertwined crises threaten 
human well-being: Climate change—arising 
from human-induced greenhouse gas emis- 
sions, including those from the loss of biomass 
and biodiversity—is raising temperatures be- 
yond those of the Holocene, when human civ- 
ilization evolved and expanded globally. Mean 
warming and the increasing frequency and 
severity of extreme events in turn disturb eco- 
system functioning, cause habitat loss to hu- 
mans and biodiversity, and exacerbate the 
unprecedented loss of biodiversity already 
caused by human-induced habitat degradation, 
overexploitation of natural resources, and pol- 
lution. Both crises reduce nature’s contributions 
to people, which sustain well-being, livelihoods, 
economies, and development prospects and 
also support climate change adaptation and 
mitigation. Failing to act will increase human 
vulnerability, including poverty, food insecurity, 
involuntary displacement, and political instability 
and conflict. The coupled global climate 
and biodiversity crises and their societal 
impacts concern land, freshwater, and ocean 
ecosystems alike but are insufficiently tackled 
by current actions, as identified by assessments 
of both the IPCC (Intergovernmental 
Panel on Climate Change) and IPBES 
(Intergovernmental Science-Policy Platform on 
Biodiversity and Ecosystem Services). None of 
the 20 Aichi biodiversity targets for 2011-2020 
and none of the mileposts on climate trajectories 
intended to limit warming to 1.5°C have 
been met. 

[Figure labels: Hazards (e.g., heatwaves, droughts, wildfires, floods, storms, sea level rise, ocean acidification, ocean oxygen loss); Climate change] 


ADVANCES: Climate, biodiversity, and societal 
challenges are intertwined but are often treated 
as singular problems. Solutions exist with co- 
benefits across sectors. In intact and functional 
ecosystems, nature is efficient at carbon cap- 
ture (by photosynthesis) and sequestration 
(long-term removal from the carbon cycle), pro- 
vided that warming is limited to 1.5°C or be- 
low through ambitious emissions reductions. 


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Climate, biodiversity, and human society are coupled through dynamic interactions across scales. 
Human-caused exploitation and climate change are increasingly threatening biodiversity and nature's 
contributions to people, causing losses and damages exemplified through overfished stocks and excessive 
drought that harm productive habitats. 

Strengthening the biosphere on land, in freshwater, 
and in the ocean will support climate 
change mitigation, adaptation, biodiversity, 
human well-being, and livelihoods. Mounting 
scientific evidence points to the need to priori- 
tize protection of remaining undamaged carbon- 
and species-rich environments and to implement 
targeted restoration projects, with more atten- 
tion to effectively sustaining biodiversity and 
fairly distributed societal cobenefits. Three crit- 
ical objectives for future spatial planning include 
a habitable climate, self-sustaining biodiver- 
sity, and sustained provisioning of nature’s 
contributions to people to support development 
and a good quality of life for all. Coordinated 
efforts among science and policy can identify and 
help navigate development pathways toward 
climate resilience for both human society and 
biodiversity. 


OUTLOOK: New global biodiversity, climate, 
and sustainability targets envisioned for 2030 
and 2050 will likely fail if drivers behind cli- 
mate change and biodiversity loss remain in- 
sufficiently addressed or concrete actions to 
meet current political agreements and goals 
do not increase in pace and scale. The follow- 
ing actions are urgently needed. (i) Ambitious 
emissions reduction, combined with suitable 
adaptation measures. (ii) Effective protection 
of an average of 30 to 50% of surface areas 
across a mosaic of interspersed and intercon- 
nected land, ocean, and freshwater “scapes.” 
These cover a gradient, from pristine ecosys- 
tems; to spaces shared by humans and wild 
species and sustainably used; to spaces under 
intensive uses such as cities, which nonetheless 
can harbor substantial biodiversity in terres- 
trial and aquatic spaces. Efforts need to con- 
sider the specific spatial demands for healthy 
ecosystems. (iii) Building of development path- 
ways and the underpinning of political, eco- 
nomic, and social institutions (including norms 
and rules) on visions such as collective respon- 
sibility, sustainable and circular uses of natural 
resources, avoidance of overconsumption and 
waste, and more equitable and participatory 
development regionally and globally. (iv) The 
enabling of just and equitable access to and 
benefits from natural assets across societies, 
groups, and individuals, securing good quality 
of life. Transformative action can overcome 
siloed approaches through institutional and 
individual change, achieving sustainability for 
nature and people, as well as human, ecosystem, 
and planetary health. 
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SUSTAINABILITY 


Overcoming the coupled climate and biodiversity 
crises and their societal impacts 


H.-O. Pörtner*, R. J. Scholes†, A. Arneth, D. K. A. Barnes, M. T. Burrows, S. E. Diamond, 
C. M. Duarte, W. Kiessling, P. Leadley, S. Managi, P. McElwee, G. Midgley, H. T. Ngo, 
D. Obura, U. Pascual, M. Sankaran, Y. J. Shin, A. L. Val 


Earth's biodiversity and human societies face pollution, overconsumption of natural resources, 
urbanization, demographic shifts, social and economic inequalities, and habitat loss, many of which 
are exacerbated by climate change. Here, we review links among climate, biodiversity, and society and 
develop a roadmap toward sustainability. These include limiting warming to 1.5°C and effectively conserving 
and restoring functional ecosystems on 30 to 50% of land, freshwater, and ocean “scapes.” We envision a 
mosaic of interconnected protected and shared spaces, including intensively used spaces, to strengthen 
self-sustaining biodiversity, the capacity of people and nature to adapt to and mitigate climate change, 
and nature’s contributions to people. Fostering interlinked human, ecosystem, and planetary health for a 
livable future urgently requires bold implementation of transformative policy interventions through 
interconnected institutions, governance, and social systems from local to global levels. 


Climate change and biodiversity loss arguably 
represent the most substantial 
challenges to ecosystem health and sup- 
ply of nature’s contributions to people 
(NCP) (1). They also directly and indi- 
rectly affect human health and well-being and 
the functioning and sustainability of human 
societies at the global scale (2, 3). Increases in 
greenhouse gas emissions now exceed 55 billion 
tonnes of carbon dioxide equivalents (Gt 
CO₂e) year⁻¹ and have resulted in warming by 
more than 1.1°C above preindustrial times, al- 
tered precipitation regimes, sea level rise, more 
extreme weather events, and oxygen depletion 
and acidification of aquatic environments (4). 
Climate change and other anthropogenic drivers 
of biodiversity loss—such as agricultural expan- 
sion and intensification, pollution, overfish- 
ing, and invasive alien species introductions— 
independently and together stress and disrupt 
ecological communities (2). Human activities 
have modified nearly 75% of Earth’s land sur- 
face and 66% of its ocean area (3, 5), leading to 
the loss of more than 80 and 50% of wild mam- 
mal and plant biomass, respectively (6), and 
threatening more species with extinction than 
ever before in human history (7). Moreover, cli- 
mate change and biodiversity loss are inextrica- 
bly connected, with each stressor contributing 
to and exacerbating the effects of the other. 
Climate change is causing species shifts and 
biodiversity loss globally because species are 
confined to limited thermal performance ranges 
and therefore geographical ranges. Climate 
warming increasingly causes species to move 
(Fig. 1, A and C) to higher latitudes (poleward), 
higher altitudes (on land), or deeper water. Range 
shifts lead to changes in species interactions, 
with cascading effects on species abundances, 
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species composition, and ecological functions. 
When species cannot track the changing climate 
or when suitable habitat shrinks, warming 
leads to mortality and biodiversity loss, as has 
been observed in the tropics (Fig. 1B). Some 
species in coral reefs, savannahs, rain forests, 
high-latitude and -altitude ecosystems, and Med- 
iterranean systems show evidence of exceed- 
ance of tolerance and adaptation limits (2). 
Mountains and continental boundaries may 
become climate dead ends because climate 
belts shift off the top of mountains, at the ends 
of a continent, or on islands. Extinction risks 
are high for species with narrow geographic 
ranges or specific habitat requirements, and 
those in the tropics and in polar regions (2). 
Climate change surpassing species’ adapta- 
tion limits constrains options to protect those 
species from extinction (2). Overall, climate 
change and sea level rise are projected to ex- 
acerbate the direct impacts of human activities, 
causing further losses in biomass, habitats, and 
species (2, 8-10). 

Conversely, biodiversity loss contributes to 
climate change through loss of wild species and 
biomass. This reduces carbon stocks and sink 
capacity in natural and managed ecosystems, 
increasing emissions (2, 4). Habitat loss and dis- 
turbance by human activities (such as through 
deforestation or expansion of livestock pro- 
duction or pollution) are the primary cause of 
biodiversity loss, increasingly exacerbated by 
climate change. Feedbacks of biodiversity loss 
on climate and of climate change on biodiver- 
sity include temperature-induced changes in 
photosynthetic capacity and carbon storage, 
modified reflectivity of the land surface, al- 
tered formation of clouds and atmospheric 
dust, and shifted biogeochemical cycling of 
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nutrients and carbon, which in turn influence 
the concentration of greenhouse gases in the 
atmosphere (11-16). 

Biodiversity loss and climate change are 
thus both drivers and consequences of one 
another, tightly linked to human activities or 
demographic change and resulting in nega- 
tive impacts on NCP, human health and well- 
being, as well as societal functioning [for 
example, (17, 18)]. Further, the human species 
is also losing habitat, through sea level rise and, 
more widely, through prolonged exposures to 
lethal temperature-humidity combinations out- 
side humans’ thermal niche, especially at but 
not limited to the lower latitudes (2). Yet these 
interlinkages are not sufficiently recognized 
in policies and multilateral agreements to 
limit climate change and halt biodiversity loss 
and societal consequences, particularly as 
the United Nations Framework Convention 
on Climate Change (UNFCCC) and the Con- 
vention on Biological Diversity (CBD) have 
largely addressed these issues independently. 
Policies and actions that simultaneously ad- 
dress climate, biodiversity, and society (the 
“nexus”) can help to avoid dangerous trade-offs 
and maximize cobenefits for humankind (18). 

Accordingly, various aspects of the “solution 
space” are developed in the following sections. 
The discussion includes (i) strategies to 
address biodiversity loss and climate change, 
and their impacts on each other; (ii) proposed 
actions to address both crises together with 
consideration of societal needs; underpinned by 
(iii) the nexus between climate, biodiversity, 
and society, as well as the enablers and urgency 
of actions leading to transformative change. 

Fig. 1. Latitudinal shifts in marine species richness (12,796 species from 23 phyla) illustrate the 
responses of species to global warming. (A) Projected shifts reflect a scenario-dependent 
impoverishment of species richness at low latitudes. [Data are from (10), reproduced from (2).] 
(B) The projection is supported by present-day observations of reduced biodiversity at the equator 
compared with slightly higher Northern and Southern latitudes (48,661 marine animal species included 
across latitudes) [data are from (149)], reflecting that heat limits are already surpassed for 
Metazoa in warm ocean waters (150). Parallel phenomena are projected to occur on land (2). 
(C) The conceptual graph illustrates how species’ thermal performance curves result from adaptation 
to the temperature regime of their biogeographical area. Overlapping curves define the temperature 
range and limits of species’ coexistence and interactions. Shifts in coexistence and ecosystem 
dynamics would result from differential latitudinal shifts and from species-specific constraints on 
thermal ranges elicited by hypoxia and/or CO₂ enrichment (ocean acidification) (8, 18, 151). 
Surpassing such limits may trigger fatal tipping points in ecosystems and dependent societies. 

[Panel A title: Projected changes in global marine species richness in 2100 compared to 2006. Scenario label: RCP8.5 = +4.3°C global warming level. Axis label (truncated): Change in species richness for a suite of taxonomic ...] 


Protecting biodiversity in a changing climate 
by multifunctional “scapes” 


Conservation of biodiversity through spatial 
planning, conservation, and restoration is in- 
creasingly seen as an existential requirement 
for humankind (as addressed by CBD 2022). It 
is a flagship target of many conservation and 
restoration programs, including the Kunming- 
Montreal Global Biodiversity Framework of 
the CBD, the UN Decade on Ecosystem Resto- 
ration (2021-2030), and the Declaration on 
Forests and Land Use at the UNFCCC Conference 
of the Parties (COP) 26 (19). Global estimates 
of the average surface area required to 
be effectively protected for self-sustaining bio- 
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diversity and long-term ecosystem health range 
from 30 to 50% for land and ocean (20-22), 
but to integrate this target with multiple social 
objectives of human well-being, including equity 
and food security, is challenging (23-26). Fur- 
ther research is needed to derive biome-specific 
guidance and targets (27, 28). For example, 
much higher protection, even 80%, might be 
needed to secure the Amazon’s capacity to 
maintain its own regional climate regime and 
thereby biodiversity and global carbon sequestration 
capacity (29-32). For maintaining self-sustaining 
biodiversity in a habitable climate, 
the conservation of biodiversity- and carbon-rich 
natural ecosystems across latitudes (such 
as diverse forests, peatlands, and seagrass 
meadows) has highest priority. The conservation 
of spatially intact ecosystems can reduce 
risks to health, such as the risk of a pandemic, 
and implementation of some managed eco- 
system types reduces climate risks to food se- 
curity, such as through agroforestry (24, 33-36). 
Conservation of, for example, coral reefs se- 
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cures multifunctional benefits that support 
the UN Sustainable Development Goals (SDGs) 
(37). More encompassing conservation actions, 
extending beyond the historical focus on pro- 
tected areas and species-based approaches, are 
thus required to safeguard biodiversity, support 
climate change mitigation and adaptation 
(28, 38, 39), and sustain NCP (1, 28, 40), 
including secure livelihoods for Indigenous 
Peoples and local communities (41). These 
ecosystem benefits include enhancing water 
and air quality; reducing soil erosion; ensuring 
pollination of crops; supporting food security 
and livelihoods; and regulating floods, regional 
climates, and microclimates. Climate change 
is already affecting access to or availability 
of many of these benefits (40). It is likely that 
with successful conservation and restoration 
efforts, human physical and mental health 
will also benefit (2). 

A substantial conservation gap exists relative to 
agreed goals (for example, the Aichi biodiversity 
targets) and, even more so, envisioned goals. As of 2020, 
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only 15% of land was declared as protected, and 
only a fraction of that area was effectively pro- 
tected, given threats of encroachment, loss of 
legal status, and shifts of biodiversity into sur- 
rounding areas that are more intensively used 
(42, 43). Protection rates are still below 10% in 
the ocean, again with a smaller fraction being 
highly protected (23). Moreover, the dichotomy 
between protected and unprotected zones 
has resulted in failures to address connectivity, 
spatial mosaic dynamics, and human needs. 
This calls for complementing conservation with 
extensive restoration efforts. 

Restoration intervenes when high degrees 
of ecosystem degradation prevent spontane- 
ous recovery. Restoration of ecosystems with 
native species enhances climate resilience of 
biodiversity but, in light of shifted biogeog- 
raphies under climate change, may include 
new species assemblages to match future cli- 
matic conditions. Restoration can involve bio- 
diversity offsetting, which aims to mitigate the 
negative impacts of infrastructure develop- 
ment, agricultural expansion, or mining by 
restoring biodiversity elsewhere. In light of 
shifting climate zones, restoration may need 
to do the same under climate change. However, 
if biodiversity-offsetting programs limit local 
people’s access or cause loss of services on 
which their livelihoods depend, this can create 
negative impacts on adaptation (44). 
Social-ecological trade-offs and disconnects between 
the loss of local and “off-stage” (distant, diffuse, 
and delayed) biodiversity benefits can be 
minimized if the scale, type, and distribution 
of NCP are considered in restoration and offsetting 
initiatives (45, 46). 

Fig. 2. Multifunctional connected and structurally and visibly distinct scapes across land, 
freshwater, and marine biomes. Scapes include large, intact wilderness spaces (blue circles), 
shared spaces (yellow circles), and anthromes (red circles). In spaces “shared” by people and 
biodiversity, the mosaic of intact natural habitat provides critical nature's contributions to 
people (NCP). Corridors of natural habitat (yellow arrows) facilitate species migration up 
elevational gradients. This multifunctional concept can assist integrating global and large-scale 
targets within local geographies (18). 

[Figure panel text] 
Large intact natural areas in remote hills, mountains, savanna and ocean, supporting biodiversity 
and NCPs that teleconnect over large distances. Mix of protection and other effective conservation 
measures, governed by indigenous peoples, communities, property owners and/or government, as 
appropriate. 
Corridors and mosaic of natural habitats enable climate migration: (1) Forest ecosystems, 
(2) Savanna ecosystems, (3) Mountain slopes, (4) Ocean ecosystems. Corridors connect the mosaic of 
natural habitats in shared spaces with re[...] of nature in intact spaces. 
Varied mosaic of nature and people in shared spaces - in forest, savanna, and ocean, and varying 
from predominantly natural (adjacent to remote areas - sides and background) to predominantly 
modified, populated and managed ecosystems (center and foreground). 20% of area under 
intact/native habitat. 
Heavily modified anthromes - cities, intensive farmland, modified coast, energy infrastructure. 
Minimize global footprint, assure local NCPs in 5% of area for good quality of life. 

A new generation of conservation and restoration 
actions that focus on multifunctional 
connected “scapes” (Fig. 2) is necessary to 
address biodiversity and climate interactions 
and their nexus with human development, 
health, and well-being (47). Increasingly, 
biodiversity management envisages a continuum 
of protection requirements from highly protected 
areas, through shared scapes, to highly 
human-dominated scapes (48). This integrative 
approach aims at the maintenance of 
self-sustaining biodiversity that is functionally 
intact and resilient with respect to species 
composition and in providing stable ecosystem 
services and that also meets human socioeconomic 
and cultural needs, including material 
provisioning (such as agriculture, forestry, and 
fisheries), nonmaterial contributions (such as 
cultural identity, spirituality and inspiration, 
and recreation), and regulatory contributions 
(such as pollination, water and soil protection, 
and microclimate). Such multifunctional scapes 
would include mosaics of protected areas, 
climate refugia and migration corridors, spaces 
shared by biodiversity and humans and modified 
by and for human use (including lands 
managed by Indigenous peoples and local 
communities), and profoundly transformed 
ecosystems such as urban and intensively farmed 
areas (Fig. 2) (47, 48). Implementing mixed-use 
land- and seascapes could enhance biodiversity 
and climate cobenefits and also generate 
diverse, inclusive, and equitable benefits to 
communities and people. In managed ecosystems, 
increasing crop diversity within fields 
(for example, by using varietal mixtures, 
intercropping, or agroforestry), adding temporal 
diversity through extended crop rotations, and 
diversifying agricultural practices at landscape 
levels dilutes risk and increases resilience to 
climate change (49, 50). In forests, management 
practices that promote biodiversity such 
as natural regeneration, mixed species stands, 
and selective harvests can also increase resilience 
and secure livelihoods (51, 52). 

The proportion and area of the different types 
of scapes would vary locally to meet different 
local-to-global objectives for biodiversity, 
livelihoods, and well-being, embedded in 
polycentric governance models that have emerged 
to manage cross-sectoral complexity (48, 53). 
Critically, planning must be closely coordinated 
across local jurisdictions and at larger scales 
to jointly meet global goals and objectives (48). 
Equitable participation and leadership by In- 
digenous peoples and local communities is es- 
sential to these efforts, given their important 
and long-term role in managing biodiverse 
scapes and in successful conservation initia- 
tives (54, 55). Extending these conservation 
measures requires substantial scaling up of 
financial investment (56), connecting efforts 
across scapes (57), and seeking positive feed- 
backs between biodiversity conservation and 
social benefits and burdens (58). 

Key ecological and social safety measures 
require consideration when conceiving and 
implementing protection and restoration pro- 
grams. The mosaic scapes approach (Fig. 2) 
would combine protection and sustainable 
uses, considering human and ecosystem needs 
per region and ecosystem. Protection and con- 
servation initiatives work well when they in- 
volve stakeholder engagement and support, 
especially at the local scale, and where recog- 
nition of values and rights and the distribu- 
tion of cobenefits (and potential trade-offs 
and costs) are perceived as just (2, 26, 39). In- 
volvement in and access to restored areas is 
critical for local communities with ecosystem- 
based livelihoods and can support conserva- 
tion success. 

Conservation and restoration activities need 
to be paralleled by reducing excess human 
consumption habits and pollution. For exam- 
ple, measures across sectors that reduce the 
demand for energy through efficiency policies 
can reduce current emissions by more than 
50% (59). Sustained reductions and eventual 
elimination of waste and pollution can be en- 
abled through the progressive introduction of 
a circular economy as part of the envisioned 
holistic transformation that builds on system 
transitions toward sustainability in land, ocean, 
coastal, and freshwater ecosystems; urban and 
rural infrastructure; energy; industry; and so- 
ciety (2, 60). Conservation and restoration suc- 
cesses depend on limiting the degree of global 
warming, ideally to below 1.5°C (2). 


Nature-based solutions to mitigate 
climate change 


The protection, sustainable management, and 
restoration of ecosystems not only benefit 
biodiversity but can also aid climate change 
mitigation and other societal challenges (17, 61). 
Maximizing biodiversity protection and con- 
servation as well as contributing to climate 
change mitigation through nature-based so- 
lutions (with large adaptation cobenefits) en- 
tails urgently avoiding and reversing losses 
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and degradation of ecosystems (2, 62, 63). 
Protection and restoration of well-connected 
areas are not only key strategies for halting 
and reversing biodiversity loss (Fig. 2) (64, 65), 
they are also among the most rapidly imple- 
mented measures that can secure and rebuild 
carbon-rich ecosystems. Restoration also has 
long-term cobenefits—for example, water qual- 
ity and poverty reduction—especially when spa- 
tial planning is optimized for such criteria 
(66). Many locations best for protecting bio- 
diversity and nature’s contributions to peo- 
ple are coincident with currently high carbon 
storage and high capacity for ongoing seques- 
tration (18, 41, 49, 67). Terrestrial examples 
include intact tropical rainforest, wetlands, 
peatlands, grasslands, and savannahs (49). 
In the ocean, healthy mangrove forests, salt 
marshes, kelp forests, and seagrass meadows 
are important (47, 67), but so are undisturbed 
sediments, as well as deep water and newly 
colonized polar blue carbon habitats emerging 
from ice melt (68). The role, scale, and efficacy 
of carbon storage through nature-based solutions 
vary considerably with regional and 
local contexts (Table 1) (23, 69). 

Although ecosystem restoration is often 
more cost effective than technological mitiga- 
tion measures, nature-based solutions should 
not be mistaken as a substitute for substantial 
emissions reductions from energy, transpor- 
tation, agriculture, infrastructure, building, 
and industrial sectors (70). The rate at which 
nature-based solutions can contribute to mit- 
igation over time is too modest compared with 
current emission rates (67, 71). Nonetheless, 
because of substantial carbon storage over 
time (Table 1), they have an important role 
in long-term climate stabilization (71, 72). 

Of concern, however, the rate and 
effectiveness of nature-based solutions are constrained 
by the degree of warming. Rising 
temperature progressively exceeds performance 
optima and adaptation limits of photosyn- 
thesis and its carbon assimilation in several 
ecosystems (following the principles outlined 
in Fig. 1). Warming thus constrains the poten- 
tial of nature-based solutions in terrestrial 
systems, exacerbated by drought (69, 73-75), 
and in some ocean systems (71, 76, 77). There- 
fore, climate change may even turn some eco- 
systems from carbon sinks to carbon sources. 
Maintaining a high capacity of carbon storage 
thus depends on limiting global warming to 
below 1.5°C, noting that the capacity of carbon 
binding and storage will decline and risks to 
carbon stocks increase with every 0.1°C on top 
of current warming (2, 67, 76). Warming by 
1.5°C will already weaken nature-based solu- 
tions and represents a threshold in the transi- 
tion from “safe” to “dangerous” climate change 
(2, 78). Restricting warming to 1.5°C through 
ambitious emissions reductions, together with 
reducing other and interacting pressures that 
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drive biodiversity decline, is thus a prerequisite 
for the success of conservation and restoration 
efforts involving natural and managed ecosys- 
tems (3, 23, 66, 71, 78-81) and a guardrail for 
avoiding catastrophic change (82). 

Provided emissions are successfully reduced, 
the effectiveness of nature-based solutions 
can be optimized when planned for longevity, 
when social equity is adequately addressed, 
and when solutions are not narrowly focused 
on rapid carbon sequestration (83) but also 
on cobenefits beyond mitigation. With care- 
ful planning and implementation, including 
spatial aspects and social safeguards, nature- 
based solutions can support multiple UN SDGs 
also in cities (84, 85). By enhancing ecosystem 
adaptive capacity and resilience, they can also 
reduce ecosystem degradation and the asso- 
ciated exacerbation of climate change. Higher 
genetic, species, and ecosystem diversity re- 
duces both social and ecological risks and keeps 
adaptation options open (86). 


Synergies and trade-offs in climate 
change mitigation, adaptation, 
and biodiversity conservation 


Biodiversity conservation and climate change 
mitigation actions that are well-managed and 
considered together tend to display more syn- 
ergies than trade-offs (Fig. 3). For example, 
both bioenergy and reforestation projects can 
be implemented carefully to contribute to hu- 
man development needs, alongside pronounced 
and rapid reductions in fossil fuel emissions 
(87, 88). The most robust path to limiting cli- 
mate change while safeguarding biodiversity 
depends on identifying the strongest win-win 
solutions by region and avoiding those with 
negative interactions (89). Management of 
natural or seminatural ecosystems through re- 
introduction of keystone species or altered 
wildfire frequency can strengthen biodiver- 
sity while enhancing climate mitigation and 
adaptation. However, although most actions 
to conserve biodiversity are positive or neutral 
for climate, some potential climate mitigation 
and adaptation actions will have negative ef- 
fects on biodiversity unless managed well (Fig. 
3). Both ecosystem protection and restora- 
tion may also interfere with food production 
unless carefully realized (24, 25). For wetland
conservation, the quantification of carbon storage
and greenhouse gas flux rates is uncertain
with respect to the balance between methane
(CH₄) sources and CO₂ sinks (87, 90), but options
are available to reduce their large CH₄
emissions, such as by restoring tidal flows (91).

Devoting vast land areas globally to the pro- 
duction of biomass for bioenergy is integral 
to many mitigation scenarios (49, 78). Yet un- 
intended negative side effects arise from their 
area requirements and competition for space, 
especially in terrestrial ecosystems, or from 
afforestation of natural grasslands (87, 92, 93). 
Projected CO₂ uptake rates through bioenergy
or monoculture forest regrowth by 2050,
which are similar in magnitude to double
today’s existing terrestrial carbon sink (49, 94),
are unrealistic. Further, relying on tree biomass
for long-term carbon sequestration is
risky, particularly in monocultures with high
vulnerability to heat, drought, storms, fire,
or pest outbreak (95). Large-scale bioenergy
crops or afforestation that use monoculture
or nonindigenous species (96) displace
other land uses locally or elsewhere.
Associated risks include carbon
and biodiversity losses, negative impacts on food
security, and dispossession of local access
to land and to a wide range of NCP (97-101).
Deployment of renewable energy infrastructure
can substantially contribute to mitigation,
but hydropower, solar and wind energy,
and storage for these intermittent sources of
energy can also have negative impacts on biodiversity
if their scale and design are not carefully
implemented (102). For example, nearly
4000 major dam projects are currently underway
or planned for energy production, irrigation,
or flood control (103), but large dams generally
have detrimental impacts on freshwater biodiversity
(104) unless reduced through sustainable
management (105). Likewise, the negative
impacts of wind turbines on bats and birds, and
of solar energy on the terrestrial or aquatic areas
they cover, can be reduced through careful
placement and operation (106, 107). Technological
mitigation measures can also harm
the environment and biodiversity
through the vast amount of materials required,
such as metals or toxic waste products (108),
which argues for strong environmental and
social sustainability criteria. As one safe alternative,
placing solar panels on half of the world’s
roofs could meet humankind’s entire (as of
2018) electricity demand (109). Expanding circular
economies will reduce the net use of
materials, enable reuse, and avoid waste (110).
Enhanced energy savings and efficiency also
reduce the extent to which infrastructure needs
to be deployed.

Table 1. Examples for climate change mitigation aided through carbon uptake and storage with biodiversity cobenefits. Entries marked [illegible] could not be recovered.

System and type*: Forest protection
Option: Reducing deforestation and forest degradation
Mitigation potential: 0.4 to 5.8 Gt CO₂e year⁻¹
Biodiversity cobenefits: Soil and forest vegetation preserved
Data source: (18, 49, 81)

System and type*: Forest restoration and reforestation
Option: Forest expansion [for example, on less than half of degraded tropical forest area (3.69 million km²)]
Mitigation potential: [illegible] (for example by 2030) [illegible] Gt CO₂e year⁻¹ [illegible] long term
Biodiversity cobenefits: [illegible] of vertebrate species
Data source: [illegible]

System and type*: [illegible]
Option: Strengthening resilience, reducing soil tillage, and maintaining plant cover
Mitigation potential: 0.1 to 5.7 Gt CO₂e year⁻¹
Biodiversity cobenefits: Resilience building, healthy soils
Data source: (154, 155)

System and type*: [illegible]
Option: Protecting and growing carbon stores
Mitigation potential: [illegible]
Biodiversity cobenefits: [illegible]
Data source: (156, 157)

System and type*: Grassland and savannah restoration and management
Option: Restoration of degraded grassland, rotational and light grazing
Mitigation potential: Reducing wildfires, strengthens soil carbon, 2.3 to 8 Gt CO₂e year⁻¹ [illegible] higher by 2030
Biodiversity cobenefits: Protection of megafauna, increase of grass-species and soil biodiversity
Data source: (16, 158, 159)

System and type*: Blue carbon habitats (coastal marine)§
Option: [illegible]
Mitigation potential: 0.14 to 0.46 Gt CO₂e year⁻¹
Biodiversity cobenefits: Avoidance of coastal habitat loss
Data source: (41)

System and type*: Blue carbon habitats (coastal marine)§
Option: [illegible]
Mitigation potential: [illegible]
Biodiversity cobenefits: [illegible] habitat
Data source: [illegible]

The remaining rows survive only as fragments:
Protection of sediments and rebuilding marine species stocks; protecting large sediment carbon stores, carbon export by abundant species stocks [illegible] plankton to whales; pathways of carbon export, large long-term potential; [illegible] stocks, enhanced nutrient cycling.
High latitude retreat of ice opens new habitat; [illegible] colonization; coastal habitat.
Seaweed use and farming; sustainably managed cultures; up to 0.24 Gt CO₂e year⁻¹ by 2050; data source [illegible] 166).

*Four types of nature-based solutions: (i) protection, (ii) restoration, (iii) management, or (iv) expansion of ecosystems in and around cities or across the wider landscape.
†Emissions reductions supported by plant-based human diets and reduced methane emissions from livestock, such as by adding farmed red seaweed to ruminant diets, and from rice paddies and reduced nitrous oxide (N₂O) from fertilizer application (49, 81, 164, 166).
‡Release of methane can offset carbon sequestration.
§Blue carbon encompasses all biologically driven carbon fluxes and storage in marine systems that are amenable to management (167).

Climate adaptation policies can also incur 
large biodiversity impacts. For example, build- 
ing sea walls to limit impacts of sea level rise 
on coastal infrastructure, adding irrigation ca- 
pacity to reduce climate change impacts on 
agriculture, or introducing exotic tree species 
in anticipation of increasing climatic stress on 
forests impose substantial risks of large ag- 
gregate losses of biodiversity from these ac- 
tions (86, 111). As such, there is an urgent need 
to assess the cumulative impacts of these adaptation
(and maladaptation) measures on biodiversity,
evaluate more positive alternatives
(for example, those that rely on nature-based
solutions), and plan deployment with biodiversity
impacts explicitly accounted for (2, 86).
In both land and ocean systems, options exist
to manage trade-offs by combining nature-based
solutions and technology-based measures for
climate change mitigation and adaptation,
while sustaining biodiversity (112). For example,
solar panel fields integrated with cropping
or grazing systems can create multiple benefits:
Grazing underneath solar panels can enhance
soil carbon, and positive spillover effects
into neighboring fields have been observed
when habitat for pollinators is created underneath
solar panels (108). Solar power generation
on land is more efficient on an area basis
than production of bioenergy crops and could
thus contribute to reducing competition for
land (113). Photovoltaic cells on the surface of
water bodies can reduce evaporation, which
could be beneficial to hydroelectric reservoirs
in arid regions, although floating photovoltaics
may also affect the water body’s physical,
chemical, and biological properties, particularly
air exchange and thus oxygen supply (114).
Offshore wind in combination with hydrogen
generation can be powerful for both mitigation
and biodiversity conservation if negative
impacts on mobile species (such as birds) can
be minimized (115); offshore turbines can create
artificial substrate, with beneficial effects
on marine biodiversity (106).

Fig. 3. Positive and negative effects (Top) of actions to mitigate climate change on actions to reduce biodiversity loss and (Bottom) of actions reducing biodiversity loss on those mitigating climate change. Orange lines indicate potential negative effects, and blue lines indicate potential positive effects. Interactions may shift over time and may surpass thresholds that trigger unforeseen consequences, positive or negative [after (18)].


Implementing a nexus approach


Lack of appropriate governance mechanisms
to anticipate and address the nexus between
climate, biodiversity, and society can lead to
the surpassing of biodiversity and climate
tipping points with dire consequences for nature
and society (2, 71, 78, 81). Cascading effects,
such as from melting of the polar ice
sheets or forest dieback, can lead to breaching
of socially acceptable limits and thresholds,
especially regarding already vulnerable
and marginalized societies (116). For example,
reduced stability of crop yields may trigger
food crises and subsequently lead to political
instability, mass migration, and other crises
(2, 81). The uncertainty in the projections of
tipping point timing and locations suggests
that the best approach to avoid them is to stay
far away from even the lowest thresholds that
would trigger irreversible change (117) and
implement quantified safety margins toward 
“safe and just” futures (118). These considera- 
tions emphasize the need for rapid progress 
in emissions reductions and biodiversity pro- 
tection (2), paralleled by rapid transformative 
action in all systems (2). 

Given the complexity of the nexus, there is 
thus a need for effective governance approaches 
that are appropriate for implementing and 
meeting protection and restoration goals, in- 
cluding multifunctional land, freshwater, and 
ocean scapes and providing equitable benefits 
that avoid trade-offs (78, 119). This requires a 
“nexus approach” to governance, which at a 
minimum would involve mainstreaming of 
biodiversity into climate policy and vice versa, 
and of both into initiatives to advance human 
development and good quality of life. Such an 
approach would also benefit from socioeco- 
logical systems perspectives, which identify 
interacting subsystems involving resources, 
users, and governance at multiple scales. Three 
key insights should inform related decisions 
and their implementation: first, all climate- 
biodiversity interactions have social consider- 
ations, with implications for both intra- and 
intergenerational equity and justice (120); sec- 
ond, telecoupling challenges abound, in which 
off-stage effects also manifest far from the location
of the intervention on different time
scales (46, 121); and third, trade-offs and feedbacks
among system components, as well as
threshold effects and nonlinear outcomes for 
biodiversity, climate, and societal relationships 
are the norm rather than the exception (119). 

Taking a nexus approach can help guide 
policy interventions that have the potential to 
generate double or triple wins and avoid unintended
consequences while facilitating deeper
and more rapid transformative change (119, 122).
Assessing the range of viable solutions requires 
recognition of differences in both environmen- 
tal conditions and social and cultural situations 
and their vulnerabilities. For example, when 
defining suitable conditions for human society, 
ecosystems, their biodiversity, and the planet, 
the interrelationships between systems and 
between human, ecosystem, and planetary 
health need to be considered (2). Shifts in plan- 
etary functioning due to climate change (for 
example, responses of ocean currents and at- 
mospheric streams) influence regional and local 
climate regimes, which in turn set environ- 
mental conditions for human well-being as well 
as ecosystem functioning, on which human
health and well-being depend (118). Cascading
unfavorable outcomes (such as habitat loss) can 
affect all levels and systems (species, ecosys- 
tems, and humans), requiring in turn multi- 
scalar action (2). 

Just as environmental characteristics differ 
from place to place and requirements for bio- 
diversity differ across biomes, motivations, 
interests, preferences, and values embedded 
in institutions, societies, and cultures are also 
highly diverse (58, 123). Biodiversity conser- 
vation interventions that take into account 
this diversity, such as those that deliberately 
include Indigenous peoples and local commu- 
nities, have better outcomes on both human 
well-being and biological indicators (54). This 
observation calls for robust and transparent 
governance mechanisms aimed at inclusion 
that are able to overcome unequal power rela- 
tions among stakeholders. A shift toward co- 
ordinated bottom-up pathways may enable 
both the local granularity and the global co- 
herence needed to address nexus challenges 
across scales (37, 48, 124). Thus, it is crucial to 
identify interventions that are universal in 
terms of intent but sufficiently flexible and 
adaptive to different socioecological contexts, 
including historically and culturally embedded 
institutional and political governance struc- 
tures (125, 126). Some promising governance 
initiatives are already emerging, including juris- 
dictional approaches, experimental policy mixes, 
and rights-based approaches (127, 128). 

These approaches often stress the need for 
integrated and multifunctional interventions 
from toolboxes of adaptive solutions instead 
of single silver bullets. More innovative and 
flexible governance approaches can also give
room to local autonomy to choose between al- 
ternative pathways to transformative change 
(129). Yet there is currently insufficient cross- 
sectoral policy coherence and integration, as seen 
in separate UN conventions such as UNFCCC 
and CBD, which often debate the interests of 
national governments who in turn have their 
own siloed approaches. Overall, although in- 
tegrated solutions for the nexus exist that have 
cobenefits in terms of sustainable develop- 
ment and meeting basic needs of the poor and 
vulnerable (130), designing and financing these 
approaches faces numerous barriers because 
they require the cooperation of multiple actors 
across scales (119, 129, 131). 


Enabling urgent action for transformative change 


Limiting warming to 1.5°C or even to below 
2°C and halting biodiversity loss require rapid 
action, which entails transformative change 
through transformative governance. Such governance
draws on collaborative solutions across
integrated systems, involves a broad range of
actors and a diversity of values about nature,
engages different knowledge systems through
more equitable approaches, and adaptively
manages complex interactions (119, 132, 133).
Transformative change implies deep shifts away 
from current ways of governing and decision- 
making that are often made on the basis of 
single-issue, short-term priorities. For example, 
current pledges of net zero emissions by compa- 
nies and countries run the risk of exacerbating 
inequitable outcomes if they push nature-based 
solutions that are inappropriate (such as tree 
planting in natural grasslands), have few so- 
cial benefits or increase vulnerability, or fail 
to be accompanied by simultaneous fossil fuel 
emissions reductions (67). Previous experiences 
from REDD+ programs (set up by UNFCCC to 
initiate “reduced emissions from deforesta- 
tion and degradation” of forests) show that 
interventions need to be codesigned across 
stakeholders to create fair compensation mech- 
anisms for equitable distribution of benefits 
and costs and to avoid risks to the most vulnerable;
otherwise, their effectiveness can be compromised
(21, 49).

Part of the move to transformative gover- 
nance entails better harnessing of positive social 
tipping points that, when reached and exceeded, 
can accelerate transformative change toward 
desirable biodiversity-climate interactions and 
help navigate through strong and often un- 
avoidable trade-offs within the nexus (Fig. 4) 
(134). Social tipping dynamics require syner- 
gizing technological, political, and behavioral 
processes toward structural reorganization, 
leading to targeted and “contagious” interventions
with positive impacts on climate, biodiversity,
and human well-being (116, 135). Such
interventions depend on social contexts, cultural
or behavioral specificities, and political
as well as institutional systems (such as norms
and regulations) (119). Ideally, goals for social
tipping points should focus on structural reorganization
through targeted interventions
across institutions, technology, and behavior 
with positive impacts on climate, biodiversity, 
and human well-being (116, 135). Some exam- 
ples include the spread of carbon-neutrality 
pledges across businesses and cities, rapid adop- 
tion of new technologies such as electric vehicles, 
or the strengthening of climate and biodiver- 
sity education and civil society engagement 
(119, 136-138). Because the locations of tipping 
points are moving targets, owing partly to the 
interconnectedness of the climate-biodiversity- 
social system, social tipping interventions need 
to be flexible and adaptive (119, 134). Activat- 
ing such social tipping interventions necessar- 
ily involves a shift of individual and collectively 
shared social values away from individualism 
and materialism to principles such as respon- 
sibility, stewardship, and justice (59, 123). Such 
shifts in values will require structural actions as 
well, such as eliminating subsidies that presently
support fossil fuel uses (139) or activities
that harm biodiversity such as deforestation
(140-142) or overfishing (143); such subsidies
currently prevent the take-off of alternative 
approaches and technologies that would im- 
mediately aid climate change mitigation by 
avoiding further emissions. Transformative 
governance would also entail creating coali- 
tions of support to confront recalcitrant and 
powerful interest groups (such as economic 
actors benefiting from fossil fuel subsidies) 
whose goal is keeping economic privileges 
and the political status quo and who lobby 
against transformative agendas (144). 
Transformative change requires the use of 
levers and decision points that have the poten- 
tial to alter future social-ecological trajectories 
(Fig. 4) (59). “Deep” leverage points (decision 
points that are harder to act on but have 
stronger potential effects) include moving 
away from valuing nature in narrow (market- 
based instrumental) ways and exploring alter- 
native collective visions of development and 
good quality of life, while empowering individuals
to enhance a sense of collective responsibility,
especially toward more vulnerable natural
systems and people (123). Interventions that
activate deep leverage points could include (i) 
reconceptualizing economic development away 
from aggregate value of market flows (such as 
gross domestic product) to others that recog- 
nize a more inclusive understanding of changes 
in wealth (145) as well as the multiple values 
of nature (123) for a good quality of life within 
climate, biodiversity, and social limits; (ii) es- 
tablishing deliberative governance instruments 
to empower civil society to take decisions by 
emphasizing their role as citizens as opposed 
to consumers (123, 146); and (iii) recognition of 
Indigenous peoples’ and community conserved 
territories and areas, initiated, designed, and
governed by local communities, along with
full development of “other effective area-based
conservation measures” (147) that can lead to
the conservation of natural and modified ecosystems
and their biodiversity and associated
benefits, including climate benefits (69).

Fig. 4. Biodiversity-climate interactions in social-ecological systems with an explicit depiction of resilience outcomes for alternative pathways. (A) Alternative pathways start from the present world and cross the opportunity space (B), reaching different levels of resilience depending on the success of transformative change (C). It is assumed that high resilience for biodiversity and ecosystems and human society is paralleled by low risk as a condition for social stability and human well-being. (D) Decision points reached over time and including positive social tipping interventions that can help shift toward positive biodiversity-climate interactions and associated social outcomes (2). [Figure


Conclusions


Achieving the scale and scope of transformative
change needed to concurrently meet
the goals of the UNFCCC, the CBD, and the
UN Agenda 2030 and its SDGs will rely on
far-reaching mobilization and transformation
actions of a type never before attempted, within
a rapidly closing time window as required to
keep global warming below 1.5°C and to secure
a livable future (2). Healthy ecosystems
and a healthy planet are preconditions for
flourishing life on Earth. Strengthening biodiversity
in all systems will also support long-term
climate stabilization. Securing human,
ecosystem, and planetary health and societal
well-being is facilitated through implementation
of multifunctional connected scapes and
other integrated actions across the climate-biodiversity
nexus, building on an analysis of
critical assets (148) and their protection and
expansion through restoration. Securing a
livable future will require rapid action and
commitment not only from countries through
actions in their national territories but also
from emergent coalitions and governance models
at all levels. Addressing conventions in
concert with one another, such as the Paris
Agreement of the UNFCCC (alongside SDG 13)
and the Kunming-Montreal Global Biodiversity
Framework (together with SDGs 14 and
15), will put society on the pathway to a positive
vision of good quality of life in harmony
with nature (37). Engaging in deeper transformative
change will require effective incentives
and social tipping points, capacity building
with broad education and outreach initiatives,
institutional change and improved cooperation
across sectors and jurisdictions, and shifts
in values to support intergenerational justice
as well as equity more broadly. This vision of
inclusive, integrative, and adaptive decision-making
can overcome societal and political
inertia (2, 3) and thereby help society to avoid
crossing biophysical tipping points and their
worst impacts, as well as help build a more
just and sustainable world (2).

INTRODUCTION: Eukaryotes contain a highly
conserved signaling pathway that becomes rap- 
idly activated when adenosine triphosphate 
(ATP) levels decrease, as happens during con- 
ditions of nutrient shortage or mitochondrial 
dysfunction. The adenosine monophosphate 
(AMP)-activated protein kinase (AMPK) is ac- 
tivated within minutes of energetic stress and 
phosphorylates a limited number of substrates 
to biochemically rewire metabolism from an 
anabolic state to a catabolic state to restore 
metabolic homeostasis. AMPK also promotes 
prolonged metabolic adaptation through tran- 
scriptional changes, decreasing biosynthetic 
genes while increasing expression of genes pro- 
moting lysosomal and mitochondrial biogenesis. 
The transcription factor EB (TFEB) is a well- 
appreciated effector of AMPK-dependent signals, 


AMPK active 
CCCP FNIP1 dephosphorylated 
Rotenone WT FNIP1 


but many of the molecular details of how AMPK 
controls these processes remain unknown. 


RATIONALE: The requirement of AMPK and its 
specific downstream targets that control aspects 
of the transcriptional adaptation of metabolism 
remain largely undefined. We performed time 
courses examining gene expression changes 
after various mitochondrial stresses in wild-type 
(WT) or AMPK knockout cells. We hypothesized 
that a previously described interacting protein 
of AMPK, folliculin-interacting protein 1 (FNIP1), 
may be involved in how AMPK promotes in- 
creases in gene expression after metabolic stress. 
FNIP1 forms a complex with the protein fol- 
liculin (FLCN), together acting as a guanosine 
triphosphate (GTP)-activating protein (GAP) 
for RagC. 
The FNIP1-FLCN complex has emerged as
an amino acid sensor to the mechanistic target
of rapamycin complex 1 (mTORC1), involved
in how amino acids control TFEB activation.
We therefore examined whether AMPK may
regulate FNIP1 to dominantly control TFEB
independently of amino acids.

Mitochondrial damage activates AMPK to phosphorylate FNIP1, stimulating TFEB translocation to the
nucleus and sequential waves of lysosomal and mitochondrial biogenesis. After mitochondrial damage,
activated AMPK phosphorylates FNIP1 (1), causing inhibition of FLCN-FNIP1 GAP activity (2). This leads to accumulation
of RagC in its GTP-bound form, causing dissociation of RagC, mTORC1, and TFEB from the lysosome (3). TFEB
is therefore not phosphorylated and translocates to the nucleus, inducing transcription of lysosomal or autophagy
genes, with parallel increases in NT-PGC1α mRNA (4), which, in concert with ERRα (5), subsequently induces
mitochondrial biogenesis (6). CCCP, carbonyl cyanide m-chlorophenylhydrazone; CLEAR, coordinated lysosomal
expression and regulation; GDP, guanosine diphosphate; P, phosphorylation. [Figure created using BioRender]


RESULTS: AMPK was found to govern expression 
of a core set of genes after various mitochondrial
stresses. Hallmark features of this response 
were activation of TFEB and increases in the 
transcription of genes specifying lysosomal 
and mitochondrial biogenesis. AMPK directly 
phosphorylated five conserved serine residues in 
FNIP1, suppressing the function of the FLCN- 
FNIP1 GAP complex, which resulted in disso- 
ciation of RagC and mTOR from the lysosome, 
promoting nuclear translocation of TFEB even 
in the presence of amino acids. FNIP1 phosphorylation
was required for AMPK to activate
TFEB and for subsequent increases in peroxisome
proliferator-activated receptor gamma coactivator
1-alpha (PGC1α) and estrogen-related
receptor alpha (ERRα) mRNAs. Cells in which
the five serines in FNIP1 were mutated to ala- 
nine were unable to increase lysosomal and 
mitochondrial gene expression programs after 
treatment with mitochondrial poisons or AMPK 
activators despite the presence and normal 
regulation of all other substrates of AMPK. 
By contrast, neither AMPK nor its control of
FNIP1 was needed for activation of TFEB
after amino acid withdrawal, illustrating the 
specificity to energy-limited conditions. 
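
The logic of these results can be summarized as a toy Boolean model. The sketch below is only an illustration, not the authors' analysis: the class and function names are hypothetical, and it assumes, as a simplification, that the FLCN-FNIP1 GAP complex is active only when amino acids are present and FNIP1 is not phosphorylated by AMPK, and that TFEB is nuclear whenever the complex is inactive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellState:
    """Inputs to the toy model; all field names are illustrative."""
    amino_acids_present: bool
    ampk_active: bool
    # False models the mutant in which the five FNIP1 serines are alanines.
    fnip1_phosphorylatable: bool = True


def flcn_fnip1_gap_active(state: CellState) -> bool:
    """Assumed rule: the GAP complex is active only with amino acids
    present and FNIP1 not phosphorylated by AMPK."""
    fnip1_phosphorylated = state.ampk_active and state.fnip1_phosphorylatable
    return state.amino_acids_present and not fnip1_phosphorylated


def tfeb_is_nuclear(state: CellState) -> bool:
    """Assumed rule: TFEB leaves the lysosome and enters the nucleus
    whenever the FLCN-FNIP1 GAP complex is inactive."""
    return not flcn_fnip1_gap_active(state)
```

Under these assumptions, energetic stress moves TFEB to the nucleus only when FNIP1 can be phosphorylated, whereas amino acid withdrawal does so without AMPK, which matches the specificity described above.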


CONCLUSION: Our data establish FNIP1 as the 
long-sought substrate of AMPK that controls 
TFEB translocation to the nucleus, defining 
AMPK phosphorylation of FNIP1 as a singular 
event required for increased lysosomal and 
mitochondrial gene expression programs after 
metabolic stresses. This study also illuminates 
the larger biological question of how mitochon- 
drial damage triggers a temporal response of 
repair and replacement of damaged mitochon- 
dria: Within early hours, AMPK-FNIP1-activated 
TFEB induces a wave of lysosome and autoph- 
agy genes to promote degradation of damaged 
mitochondria, and a few hours later, TFEB-up-regulated
PGC1α and ERRα promote expression
of a second wave of genes specifying
mitochondrial biogenesis. These insights open 
therapeutic avenues for several common dis- 
eases associated with mitochondrial dysfunction, 
ranging from neurodegeneration to type 2 dia- 
betes to cancer. 

Cells respond to mitochondrial poisons with rapid activation of the adenosine monophosphate-activated
protein kinase (AMPK), causing acute metabolic changes through phosphorylation and prolonged
adaptation of metabolism through transcriptional effects. Transcription factor EB (TFEB) is a major
effector of AMPK that increases expression of lysosome genes in response to energetic stress, but how
AMPK activates TFEB remains unresolved. We demonstrate that AMPK directly phosphorylates five
conserved serine residues in folliculin-interacting protein 1 (FNIP1), suppressing the function of
the folliculin (FLCN)-FNIP1 complex. FNIP1 phosphorylation is required for AMPK to induce nuclear
translocation of TFEB and TFEB-dependent increases of peroxisome proliferator-activated receptor
gamma coactivator 1-alpha (PGC1α) and estrogen-related receptor alpha (ERRα) messenger RNAs. Thus,
mitochondrial damage triggers AMPK-FNIP1-dependent nuclear translocation of TFEB, inducing
sequential waves of lysosomal and mitochondrial biogenesis.


The ability to adapt to prolonged nutrient
deprivation is an essential characteristic
for survival of all organisms. Eukaryotes
contain a highly conserved signaling
pathway that becomes rapidly activated
when adenosine triphosphate (ATP) levels in
the cell decrease, most often as a result of loss
of mitochondrial ATP production caused by
decreased oxygen or glucose concentrations
or in response to mitochondrial poisons that
directly interfere with oxidative phosphorylation
(OXPHOS) (1). The adenosine monophosphate
(AMP)-activated protein kinase
(AMPK) becomes activated fully within minutes
of OXPHOS inhibition and rapidly phosphorylates
a limited number of direct substrates
that regulate lipid and glucose metabolism,
autophagy, and mechanistic target of rapamycin
complex 1 (mTORC1) signaling (2, 3).
If energetic stress is prolonged, metabolism is
further modified by transcriptional changes
to gene expression programs governing distinct
metabolic processes (4). The transcription
factor EB (TFEB) and related TFE3 are
activated in response to nutrient deprivation
and energetic stress, and both are suppressed
by mTORC1 signaling (5-9) and activated by
AMPK signaling (10, 11). mTORC1 directly
phosphorylates TFEB on Ser122, Ser142, and
Ser211, resulting in its exclusion from the nucleus
(6).
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Amino acids (AAs) regulate the ability of 
mTORCI to phosphorylate TFEB through a 
guanosine triphosphate (GTP)-activating pro- 
tein (GAP) complex composed of the folliculin 
(FLCN) and FLCN-interacting protein 1 (FNIP1) 
proteins (12-75). The FLCN-FNIP1 complex dic- 
tates GTP loading of the guanosine triphos- 
phatase (GTPase) RagC, which results in the 
release of TFEB and TFE3 from the lysosome 
and away from mTORC1, causing their nuclear 
translocation (/4). In the nucleus, TFEB and 
TFE3 directly bind to a well-defined DNA bind- 
ing element [coordinated lysosomal expression 
and regulation (CLEAR) motif] found conserved 
in the proximal promoters of >50 components 
of the lysosome and a number of autophagy 
genes (16, 17). In this way, TFEB and TFE3 are 
held inactive under nutrient-replete conditions, 
but in response to specific cellular stresses, they 
translocate into the nucleus and promote
lysosomal biogenesis and autophagy (18). AMPK
is required for TFEB translocation to the
nucleus during energetic stress (10, 11, 19). How
AMPK activates TFEB remains unknown, but 
it is presumed to rely on AMPK-dependent sup- 
pression of mTORC1 through its established 
phosphorylation of the mTORC1 component 
Raptor and upstream regulator TSC2. We ex- 
amined the possibility that TFEB and TFE3 
were direct substrates of AMPK, but we did 
not find any evidence supporting this in vivo. 
Recently, AMPK has been reported to phos- 
phorylate TFEB at three C-terminal residues, 
and these phosphorylation events may have a 
role in lysosomal gene transcription (20). How- 
ever, all three of these tightly clustered sites 
poorly match the AMPK substrate consensus 
motif, and mutation of these sites does not
disrupt the ability of AMPK to induce nuclear
translocation of TFEB and TFE3 (20). Thus, an 
unknown AMPK-dependent event may govern 
the translocation of TFEB and TFE3 to the nu- 
cleus, without which CLEAR gene transcrip- 
tion cannot occur, even if TFEB and TFE3 are 
phosphorylated at their C terminus by AMPK- 
dependent signals. 

The precise mechanism of how AMPK ac- 
tivates the transcriptional program of mito- 
chondrial biogenesis is also unresolved. AMPK 
controls mitochondrial biogenesis and
synthesis of peroxisome proliferator-activated
receptor gamma coactivator 1-alpha (PGC1α) mRNA
in response to energetic stress (21-24). Several
mechanisms for how AMPK may promote
accumulation of PGC1α mRNA and PGC1α
function have been proposed, including control of
PGC1α phosphorylation or acetylation (25, 26),
but the precise mechanism or mechanisms re- 
main poorly understood. We performed time 
course analysis of the transcriptional response 
to mitochondrial OXPHOS inhibitors, reveal- 
ing a temporal cascade of organellar bioge- 
nesis genetically dependent on AMPK. We 
identified FNIP1 as a direct AMPK substrate 
whose phosphorylation is critical for TFEB 
activation and nuclear translocation, which 
in turn leads to the production of PGC1α and
estrogen-related receptor alpha (ERRα) mRNAs,
resulting in a wave of lysosomal biogenesis fol- 
lowed by mitochondrial biogenesis. 


Electron transport chain inhibitors require 
AMPK to induce mitochondrial gene 
transcription 


To delineate the role of AMPK in the transcrip- 
tional response to mitochondrial energetic 
stress, we disrupted AMPKα1 and AMPKα2
in human embryonic kidney 293T (HEK293T) 
cells [AMPK knockout (KO)] by CRISPR-Cas9 
and subjected wild-type (WT) control and 
AMPK KO cells to electron transport chain 
(ETC) inhibitors, including 100 ng/ml rotenone 
(complex I), 2 mM phenformin (complex I), and 
5 µM CCCP (carbonyl cyanide
m-chlorophenylhydrazone; a protonophore that disrupts the
electrochemical gradient required for ATP 
production). We performed a time course of 
RNA sequencing (RNA-seq) of up to 16 hours. 
Differential expression analysis of all genes in 
the RNA-seq dataset revealed the gene expres- 
sion patterns induced by each of the drugs. 
Differential expression defined by fold change 
(FC) ≥ 1.3 and P < 0.05 (P values generated
from t tests were corrected using the
Benjamini-Hochberg method) followed by hierarchical
clustering uncovered a set of common genes, 
whose expression was induced by CCCP, rote- 
none, and phenformin in WT controls but not 
in cells lacking AMPK (fig. S1 and Fig. 1A).
Approximately one-third of those genes with
transcription increased similarly by the three
ETC inhibitors in the first 16 hours required
AMPK for full gene induction (Fig. 1, A and B).
Analysis of this common, AMPK-dependent,
and mitochondrial energetic stress-induced
gene set at 2- and 4-hour time points, using
the Gene Transcription Regulatory Database
(GTRD) of chromatin immunoprecipitation
sequencing (ChIP-seq) datasets, revealed TFEB
to be the most enriched transcription factor
(Fig. 1C) based on the binding of transcription
factors to the -1000 to +100 bases around the
transcription start site of each differentially
expressed gene (27).

Fig. 1. Dominant role of AMPK in the
transcriptional response to mitochondrial
poisons through the MiT-TFE family of
transcription factors. RNA-seq analysis of WT
and CRISPR-Cas9-mediated AMPK KO HEK293T
cells upon 0- to 16-hour treatment with the
mitochondrial poisons CCCP (5 µM), rotenone
(100 ng/ml), phenformin (2 mM), and the
AMPK-specific activation drug 991 (50 µM).
(A) Unbiased heatmap displaying gene expression
pattern of all AMPK-dependent, differentially
expressed (DE) genes (FC ≥ 1.3, P < 0.05)
commonly regulated by all three mitochondrial
poisons and 991. (B) Stacked Venn diagram
showing the proportion of DE CCCP-induced genes
that require AMPK. (C) GSEA analysis shows
significantly up-regulated GTRD (ChIP-seq-based
Gene Transcription Regulation Database)
transcription factor targets upon CCCP
treatment. (D) Gene clustering analysis and
heatmap displaying the expression pattern of all
mitochondria-specific genes as defined by the
MitoCarta 3.0 inventory. Right heatmap is a
zoomed-in view of the AMPK-dependent
mitochondrial genes induced by the four drugs.
(E) Overlap in regulation of AMPK-dependent
mitochondrial genes in (D) by CCCP, rotenone,
phenformin, and 991. (F) Volcano plot depicting
DE mitochondrial genes from (D) after 991
compared with DMSO. Red dots represent genes
significantly induced by 991 compared with
DMSO. The y axis denotes -log10 P values, and
the x axis shows log2 FC values. (G) Volcano
plot denoting differential expression of
mitochondrial genes between WT 16-hour
991-treated cells compared with AMPK KO 16-hour
991-treated cells. Blue dots represent genes
significantly down-regulated by AMPK deletion
compared with the WT condition. The y axis
denotes -log10 P values, and the x axis shows
log2 FC values. (H to J) Quantitative RT-PCR
(qRT-PCR) for lysosomal gene Lamp2 (H) and
mitochondrial genes IDH2 (I) and Cox6A1 (J) in
WT and AMPK KO HEK293T cells after CCCP.
(K to M) qRT-PCR for lysosomal gene Lamp2 (K)
and mitochondrial genes [...] KO HEK293T cells
after [...] means ± SEMs. *P < 0.05;
**P < 0.01; ***P < 0.001; ****P < 0.0001.
[...] treated with DMSO, 991, or phenformin for
1 hour. (O) Analysis of AMPK, TFEB, and
mitochondrial protein immunoblotting of WT and
AMPK KO HEK293T cells treated with a rotenone
(100 ng/ml) time course, ranging from 0 to
24 hours. (P) Analysis of AMPK, TFEB, and
mitochondrial protein immunoblotting of WT and
AMPK KO HEK293T cells treated with a 991
(50 µM) time course, ranging from 0 to
24 hours. (Q) Analysis of TFEB and TFE3
protein [...]
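
As an illustrative aside, the selection criteria described in the text (a fold change of at least 1.3, a Benjamini-Hochberg-corrected P value below 0.05, and induction in WT but not AMPK KO cells) can be sketched in a few lines of Python. This is a minimal sketch and not the pipeline used in the study: the function names are hypothetical, and reading "AMPK-dependent" as "induced by every treatment in WT cells and by none in KO cells" is an assumption made for the example.

```python
from typing import Dict, List, Set


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg-adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def induced_genes(fold_change: Dict[str, float],
                  p_raw: Dict[str, float],
                  fc_cutoff: float = 1.3,
                  alpha: float = 0.05) -> Set[str]:
    """Genes with FC >= fc_cutoff and adjusted P < alpha (hypothetical helper)."""
    genes = list(p_raw)
    p_adj = benjamini_hochberg([p_raw[g] for g in genes])
    return {g for g, q in zip(genes, p_adj)
            if fold_change[g] >= fc_cutoff and q < alpha}


def ampk_dependent(wt_sets: List[Set[str]], ko_sets: List[Set[str]]) -> Set[str]:
    """Assumed definition: induced by every treatment in WT, by none in KO."""
    common_wt = set.intersection(*wt_sets)
    any_ko = set.union(*ko_sets)
    return common_wt - any_ko
```

In use, `induced_genes` would be called once per drug and genotype, and the resulting sets passed to `ampk_dependent`; the gene identifiers and numbers supplied to it are whatever the expression table provides.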

Volcano plots illustrating genes with the 
greatest FC and highest degree of statistical 
significance revealed core components of lyso- 
somes and mitochondria as some of the most 
increased genes in an AMPK-dependent
manner in cells exposed to CCCP for 16 hours (fig.
S2, A and B); for example, EPDR1 (ependymin
related 1) is a lysosomal protein that binds
gangliosides, OXCT1 [3-oxoacid coenzyme A
(CoA)-transferase 1] encodes a mitochondrial 
enzyme in ketone body catabolism, and ACSS3 
(acyl-CoA synthetase short chain family mem- 
ber 3) encodes a mitochondrial enzyme in fatty 
acid oxidation. Gene ontology (GO) enrichment 
analysis on AMPK-responsive genes in cells 
treated with CCCP revealed that “lysosomal 
lumen” and “secretory granule lumen” were 
among the most enriched GO organelle terms 
of the AMPK-dependent, CCCP-induced gene 
set, and “Electron Transport Chain (OXPHOS)” 
was the second-most enriched WikiPathways 
term (fig. S2C). Gene set enrichment analysis 
(GSEA) of the CCCP-induced RNA-seq dataset 
confirmed enrichment of “KEGG Lysosome” 
and “Hallmarks OXPHOS” gene sets in WT cells 
but not in cells lacking AMPK treated with 
CCCP for 16 hours (fig. S2D). Parallel analysis 
of the genes differentially expressed in re- 
sponse to rotenone in an AMPK-dependent 
manner revealed similar enrichment in GO 
for lysosomes and enrichment for mitochon- 
drial processes in WikiPathways analyses, again 
with statistically significant enrichment for 
the “KEGG Lysosome” and “Hallmarks OXPHOS” 
gene sets (fig. S2, E and F). In addition to ob- 
serving a common transcriptional response 
between CCCP, rotenone, and phenformin, we 
also found that the synthetic small-molecule, 
direct AMPK activator 991 (28) similarly enriches 
for the same lysosomal genes and mitochon- 
drial targets (fig. S2, G and H). Transcription 
factor enrichment analysis of 991-induced 
genes in the WT condition identified ESRRA 
as a transcription factor, whose targets are over- 
represented in the up-regulated genes (fig. 
S2I). The estrogen-related receptor alpha gene,
ESRRA, which encodes ERRa, is a known key 
mediator of mitochondrial biogenesis (29, 30). 
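
For readers unfamiliar with over-representation statistics, the following Python sketch shows one common way to score whether an annotation term is enriched in a gene list: a one-sided hypergeometric test. The text does not state which test the GO or WikiPathways tools applied, so this choice is an assumption, and the function name is hypothetical. It does not implement GSEA, which is a rank-based method.

```python
from math import comb


def hypergeom_enrichment_p(universe: int, term_size: int,
                           hits: int, overlap: int) -> float:
    """One-sided P value, P(X >= overlap).

    `hits` genes are drawn without replacement from `universe` genes,
    of which `term_size` carry the annotation of interest.
    """
    upper = min(term_size, hits)
    tail = sum(comb(term_size, k) * comb(universe - term_size, hits - k)
               for k in range(overlap, upper + 1))
    return tail / comb(universe, hits)
```

A small P value indicates that the observed overlap is larger than expected by chance; in practice such P values would also be corrected for the number of terms tested.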

Having observed a strong lysosomal and 
mitochondrial component to the common dif- 
ferentially expressed genes upon treatment with 
all four drugs, we next examined
mitochondrial genes more comprehensively. We
reanalyzed our data, using MitoCarta version
3.0, a curated catalog of ~1000 genes encoding
the mammalian mitochondrial proteome (31),
to assess mitochondrial biogenesis. There was
significant overlap in the mitochondrial genes
increased after CCCP, rotenone, and phenformin
with those increased by 991 (Fig. 1, D and E).
The analysis revealed increased transcription 
of ~300 mitochondrial genes in WT cells treated 
with ETC poisons or small-molecule, AMPK 
activator 991 but not cells lacking AMPK (Fig. 1, 
D, F, and G). Despite the differences in the 
mechanisms by which these compounds in- 
hibit the ETC, this analysis indicates that a 
large proportion of the transcriptional response 
mediated by ETC poisons requires AMPK ac- 
tivation. We validated the increased expres- 
sion of core lysosomal and mitochondrial genes, 
including LAMP2 (Fig. 1H), IDH2 (Fig. 1I),
Cox6A1, or ACO2 (Fig. 1J), in response to CCCP
and 991 (Fig. 1, K to M), all of which showed 
increased expression in WT cells but not those 
lacking AMPK when assessed by quantitative 
polymerase chain reaction (qPCR). 

Given the prominent role for TFEB implied 
by the regulation of genes induced by mito- 
chondrial poisons (Fig. 1C) and TFEB’s major 
role in AMPK-dependent effects on transcrip- 
tion (9, 10, 17), we examined the regulation of 
TFEB protein in WT or AMPK KO cells treated 
with mitochondrial poisons. TFEB isolated 
from 991- (1 hour) or phenformin- (1 hour) 
treated WT cells showed a reduced mobility 
band-shift of endogenous TFEB, which did 
not occur in TFEB from AMPK KO cells (Fig. 
1N). We examined TFEB electrophoretic mobil- 
ity over longer time courses of treatment with 
rotenone and 991 and again observed a per- 
sistent downward mobility shift in WT cells 
between 2 and 24 hours after treatment, which 
was barely detectable in the AMPK KO cells 
(Fig. 1, O and P). We examined protein
abundance of some mitochondrial targets
implicated by RNA-seq or qPCR and observed
up-regulation of PDHA1 and IDH2 after 991 
or rotenone, which was not observed in AMPK 
KO cells (Fig. 1, O and P). 

We assessed the localization of TFEB and 
TFE3 through nucleo-cytoplasmic fractionation. 
In WT cells, 991 treatment led to increased 
abundance of TFEB and TFE3 in the nuclear 
fraction with almost none remaining in the 
cytoplasmic compartment (Fig. 1Q). In cells 
lacking AMPK, 991 treatment did not lead to 
translocation of TFEB or TFE3 to the nuclear 
fractions. Immunofluorescence microscopy of 
intracellular TFEB with an antibody that de- 
tects endogenous TFEB confirmed these re- 
sults. In WT cells under basal conditions, TFEB 
was mostly in the cytoplasm with low amounts 
in the nucleus (Fig. 1, R and S). Upon AMPK 
activation by 991, conditions under which TFEB 
is fully dephosphorylated, almost all TFEB
translocated to the nucleus. Conversely, in AMPK KO
cells, with or without 991, most TFEB remained 
in the cytoplasm (Fig. 1, R and S). These re- 
sults indicate that TFEB is a major effector 
of AMPK after mitochondrial energetic stress 
(Fig. 1T).


FNIP1 is a conserved AMPK substrate 
that governs TFEB phosphorylation status 
and localization 


A screen we performed looking for AMPK 
substrates that could mediate cell growth and 
metabolism (32) identified FNIP1. FNIP1 is 
an established interacting partner of FLCN 
and has been reported to coimmunoprecipi- 
tate with AMPK and be regulated by AMPK 
through unknown mechanisms (33, 34). The
FNIP1-FLCN complex has emerged as an AA sensor
to mTORC1 (11-14), involved in how AAs
control TFEB activation (14). We therefore
examined whether AMPK may regulate FNIP1 to
control TFEB independently of AAs. Analysis 
of the FNIP1 protein sequence revealed four
sites that match the optimal AMPK substrate
motif (Ser²³⁰, Ser²³², Ser²⁶¹, and Ser⁵⁹³)
(Fig. 2A). We used mass spectrometry (MS) to
examine the phosphorylation of FNIP1 in vivo.
We transfected HEK293T cells with an
epitope-tagged FNIP1 cDNA and treated the cells
with dimethyl sulfoxide (DMSO) or phenformin. We
detected peptides spanning all four of these
FNIP1 candidate sites (Ser²³⁰, Ser²³², Ser²⁶¹,
and Ser⁵⁹³) (Fig. 2B), phosphorylation of which
had increased in the phenformin-treated
samples. MS analysis detected a fifth site, Ser²²⁰.
This also appeared to be highly phosphorylated
after AMPK activation and largely conforms
to the optimal motif (Fig. 2A), indicating that
it too may be a site of AMPK phosphorylation
(Fig. 2B).
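
To make the idea of scanning a protein sequence for candidate AMPK sites concrete, the Python sketch below checks each serine or threonine against a simplified, commonly cited form of the AMPK consensus: a hydrophobic residue at positions -5 and +4 and a basic residue at -4 or -3 relative to the phosphoacceptor. This encoding is an assumption made for illustration and is coarser than the optimal motif referred to in Fig. 2A; the function name is hypothetical, and the sketch is not the analysis performed in the study.

```python
from typing import List

HYDROPHOBIC = set("MLIFV")  # simplified hydrophobic class (assumption)
BASIC = set("RKH")          # simplified basic class (assumption)


def ampk_like_sites(seq: str) -> List[int]:
    """Return 1-based positions of Ser/Thr residues matching the simplified motif."""
    sites = []
    for i, aa in enumerate(seq):
        if aa not in "ST":
            continue
        if i < 5 or i + 4 >= len(seq):
            continue  # not enough flanking sequence to evaluate the motif
        if (seq[i - 5] in HYDROPHOBIC
                and (seq[i - 4] in BASIC or seq[i - 3] in BASIC)
                and seq[i + 4] in HYDROPHOBIC):
            sites.append(i + 1)
    return sites
```

Applied to a full-length protein sequence, the function returns residue numbers that can then be compared with phosphopeptides detected by MS.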

To test whether FNIP1 is a direct substrate of
AMPK, we undertook an in vitro ³²P-γ-MgATP
phosphorylation assay. We immunopurified
FLAG epitope-tagged WT FNIP1 or FNIP1 S-A
mutant proteins (Fig. 2C) after transient
transfection of cDNA into HEK293T cells.
Recombinant AMPK phosphorylated WT FNIP1, and
mutation of the candidate AMPK-site serines
(Ser²³⁰, Ser²³², Ser²⁶¹, Ser⁵⁹³, and Ser²²⁰)
reduced this phosphorylation, with the greatest
effect seen when all five sites were mutated
(hereafter referred to as the SA5 mutant)
(Fig. 2C). Notably, variability in the behavior
and lack of effect on overall ³²P incorporation
of some mutants in the in vitro phosphorylation
assay leaves the possibility open that some of
the sites are not directly regulated by AMPK or
are redundantly phosphorylated, masking the
impact of loss of one or two sites as judged by
overall ³²P incorporation in vitro. To verify
FNIP1 as an in vivo substrate of AMPK, we
generated a phospho-specific antibody, targeting
the pFNIP1 Ser²²⁰ site. To validate the Ser²²⁰
antibody, we used HEK293T cells lacking FNIP1
generated by CRISPR-Cas9 methodology (fig. S3A)
and then transiently transfected those cells
with cDNA for WT FNIP1, S220A, SA3 (S-A
mutations of Ser²³⁰, Ser²³², and Ser²⁶¹), SA4
(S-A mutations of Ser²³⁰, Ser²³², Ser²⁶¹, and
Ser⁵⁹³), and SA5 FNIP1 mutants. These cells were
treated with either DMSO or 991, and then
phosphorylation status of FNIP1 was assessed
with the pFNIP1 Ser²²⁰ antibody. The antibody
detected phosphorylation of the S220 site in WT
FNIP1, SA3, and SA4 FNIP1 conditions but not in
the S220A or SA5 condition, in which the Ser²²⁰
site had been mutated to alanine (fig. S3B). A
phospho-specific antibody was also generated
to the Ser²⁶¹ site, which was only effective
after FNIP1 immunoprecipitation; nonetheless, we
observed increased Ser²⁶¹ phosphorylation in
cells overexpressing WT FNIP1 but not SA5
FNIP1, in which Ser²⁶¹ is mutated to alanine
(fig. S3C).

To assess phosphorylation of endogenous
FNIP1, we treated mouse embryonic fibroblasts
(MEFs) with 991. Endogenous FNIP1 was
robustly phosphorylated at the Ser²²⁰ site within
30 min, whereas TFEB and TFE3 were
dephosphorylated under the same conditions in WT
but not AMPK KO MEFs (Fig. 2D). We also
examined phosphorylation of endogenous
FNIP1 in liver lysates of mice after treatment
with MK-8722, an orally available
991 analog that also activates AMPK (35). We
detected Ser²²⁰ phosphorylation of FNIP1 in
livers from WT mice treated with MK-8722
but not in livers from AMPKα1/α2
(Prkaa1/Prkaa2) liver-specific KO mice (Fig. 2E).
Other AMPK substrates, such as pRaptor
Ser⁷⁹², were also not phosphorylated in these
mice. TFEB and TFE3 were dephosphorylated
in livers from WT animals treated with
MK-8722 but not in livers lacking AMPK
(Fig. 2E). Treatment of primary hepatocytes
with metformin induced phosphorylation of
endogenous FNIP1 at Ser²²⁰ in WT hepatocytes
but not hepatocytes lacking AMPK
(Fig. 2F). TFEB and TFE3 became
dephosphorylated in WT hepatocytes treated with
metformin but remained phosphorylated in
hepatocytes lacking AMPK, even after
metformin administration (Fig. 2F). Thus, FNIP1
appears to be a bona fide, in vitro and in vivo
substrate of AMPK, phosphorylated upon
direct AMPK stimulation and in response to
mitochondrial energetic stress.

Given that both mTOR and AMPK
antagonistically regulate TFEB and TFE3
transcription factors and that both pathways converge
on the FNIP1-FLCN complex, phosphorylation
of FNIP1 by AMPK could represent the
dominant mechanism through which AMPK
controls these transcription factors. To test for
such a role for FNIP1, we used lentivirus to
stably reconstitute cells depleted of FNIP1 with
full-length cDNA encoding either WT FNIP1,
SA4 FNIP1, or SA5 FNIP1 and treated these cells
with 991 (Fig. 2G) or phenformin (fig. S3D). In
WT FNIP1 cells, AMPK activation led to
dephosphorylation of TFEB and TFE3 within
30 to 60 min of 991 or phenformin addition
(Fig. 2G and fig. S3D). However, in the SA4 and
SA5 FNIP1 cells, TFEB and TFE3 remained in
the slow-mobility hyperphosphorylated form
whether or not cells were treated with 991 or
phenformin (Fig. 2G and fig. S3D). This effect
was further enhanced when all five AMPK
sites on FNIP1 were mutated. TFEB was fully
phosphorylated in FNIP1 SA5 cells, even in
the presence of phenformin or 991 (Fig. 2G
and fig. S3D). In our prior studies, AMPK
promoted activation of TFEB in cells deprived of
glucose, but the mechanism was unknown
(11). To test whether AMPK control of FNIP1
was involved in the regulation of TFEB after
glucose starvation, we examined TFEB in WT
FNIP1 and SA5 FNIP1 cells deprived of
glucose. Six hours of glucose deprivation activated
AMPK, leading to FNIP1 Ser²²⁰ phosphorylation
in WT FNIP1 cells and TFEB
dephosphorylation, as did treatment with 991 (fig. S3E).
Conversely in SA5 cells, mutation of AMPK
phosphosites on FNIP1 prevented TFEB
dephosphorylation induced by glucose starvation
(fig. S3E). Furthermore, given the importance
of Ser²²⁰ as the one serine different between
SA4 and SA5 and the apparent in vitro
phosphorylation of Ser⁵⁹³ (Fig. 2C), we examined
the effect of a Ser²²⁰-Ser⁵⁹³ AA mutant (SA2),
which revealed incomplete regulation of TFEB,
again implying a critical role for all five serines,
including 230, 232, and 261 (fig. S3F).

If AMPK phosphorylation of FNIP1 is nec- 
essary for TFEB dephosphorylation, local- 
ization of TFEB and TFE3 might be expected 
to change upon FNIP1 phosphorylation. To test 
this, we isolated nuclear and cytoplasmic
fractions from WT FNIP1 cells treated with 991.
In contrast to the control, after 991, TFEB was 
enriched in the nucleus with little cytoplasmic 
TFEB detected (Fig. 2H). However, after 991 
treatment in SA5 FNIP1 cells, most TFEB
remained in the cytoplasm (Fig. 2H).
Immunofluorescence microscopy to visualize endogenous
TFEB and TFE3 confirmed this observation. In 
WT FNIP1 cells in fresh media, both
endogenous TFEB (Fig. 2, I and J) and endogenous
TFE3 (fig. S3, G and H) displayed a fully cyto- 
plasmic localization, but upon 991 treatment, 
both translocated to the nucleus despite the 
presence of full AAs. Conversely, in SA5 FNIP1 
cells, AMPK activation by 991 did not cause 
translocation of TFEB or TFE3 to the nucleus, 
and both were primarily cytoplasmic (Fig. 2, I 
and J, and fig. S3, G and H). Thus,
phosphorylation of FNIP1 by AMPK appears to dominantly
govern localization of TFEB (fig. S3I).


AMPK-dependent phosphorylation of 
FNIP1 controls mTORC1 binding to TFEB 
and TFE3 


Although TFEB is phosphorylated by extracel- 
lular signal-regulated kinase 1 (ERK), glycogen 
synthase kinase-3 (GSK3), and protein kinase 
B (PKB or Akt), much regulation of TFEB is 
thought to be through phosphorylation of
Ser¹⁴² and Ser²¹¹ by mTORC1 (6). To delineate
TFEB regulation by AMPK versus mTORC1, we
treated WT HEK293T, AMPK KO HEK293T,
WT FNIP1, and SA5 FNIP1 HEK293T cells with
991. TFEB and TFE3 became dephosphorylated
in WT cells and WT FNIP1 cells within 30 min
but remained highly phosphorylated in AMPK
KO and SA5 FNIP1 cells (Fig. 3, A and B).
mTORC1 signaling was attenuated by 991 in
WT HEK293T and WT FNIP1 HEK293T cells
within 30 min of treatment, as reflected by 
the decreased phosphorylation of the direct 
mTORC1 substrates eukaryotic translation 
initiation factor 4E (eIF4E)-binding protein 
1 (4EBP1) and ribosomal protein S6 kinase 
beta-1 (P70S6K), and in turn, the P70S6K sub- 
strate ribosomal protein S6 (S6). By contrast, 
in AMPK KO HEK293T cells, mTORC1 signaling 
did not change with 991 treatment, because 
of the absence of AMPK (Fig. 3A). However, 
in both WT and SA5 FNIP1 cells, canonical 
mTORC1 signaling to P-S6K and 4EBP1 was
suppressed upon AMPK activation with 991,
whereas TFEB and TFE3 remained fully phos- 
phorylated in the SA5 cells (Fig. 3B). This re- 
vealed a disconnect where FNIP1 modification 
affected mTORC1-dependent phosphorylation 
of TFEB but not mTORC1 phosphorylation of 
S6K1 and 4EBP1, consistent with multiple re- 
cent studies (15, 36, 37). 

We examined the effect of AA abundance on 
TFEB phosphorylation status and mTORC1 
activity in the context of our FNIP1 phospho- 
rylation site mutants. mTORC1 signaling, as 
reflected by phosphorylation of p70S6K, S6, 
and 4EBP1, was decreased in both WT and 
AMPK KO HEK293T cells deprived of AAs 
(Fig. 3C). In cells deprived of AAs, TFEB and 
TFE3 became dephosphorylated in both WT 
and AMPK KO cells (Fig. 3C), contrasting with 
cells treated with 991, where TFEB and TFE3 
became dephosphorylated in WT but not AMPK 
KO HEK293T cells (Fig. 3A). Similarly, mTORC1
signaling was decreased after AA deprivation
in both WT FNIP1 and SA5 FNIP1 cells. TFEB
and TFE3 also became dephosphorylated when
both WT FNIP1 and SA5 FNIP1 cells were
deprived of AAs for 1 hour (fig. S4A), whereas
activation of AMPK with 991 caused TFEB to be
dephosphorylated in WT FNIP1 cells but not
SA5 FNIP1 cells. These findings reveal that
AMPK-FNIP1-mediated control of the
microphthalmia-transcription factor E (MiT-TFE)
family of transcription factors through mTORC1
is distinct from that caused by a shortage of
AAs. However, dephosphorylation of TFEB and
TFE3 in the SA5 FNIP1 cells deprived of AAs
was slower than that observed in WT FNIP1
cells, which demonstrates the inherent control
that FNIP1 exerts on TFEB and TFE3 in general
(fig. S4A).

To clarify whether TFEB phosphorylation 
remaining in SA5 cells after AMPK activation 
was from uninhibited mTORC1-dependent phos- 
phorylation or an unknown kinase, we treated 
WT and SA5 FNIP1 cells with two potent and
selective inhibitors of mTOR, AZD8055 and 
Torin1, with or without 991. Both AZD8055 
and Torin1 decreased TFEB phosphorylation 
in cells expressing SA5 FNIP1, demonstrating 
that the TFEB hyperphosphorylation observed 
in these cells was indeed a result of mTORC1 
(Fig. 3D). Although phosphorylation of other 
mTOR substrates, such as S6K1 or 4EBP1, was
not affected by mutation of AMPK sites on 
FNIP1, TFEB specifically was highly regulated 
by the AMPK-dependent phosphorylation of 
FNIP1. To further understand the regulation 
of TFEB in this context, we generated WT FNIP1 
and SA5 FNIP1 HEK293T cells that stably ex- 
press TFEB tagged with green fluorescent 
protein (GFP) and subjected these cells to 991 
treatment (Fig. 3E). GFP-TFEB immunopreci- 
pitates from WT FNIP1 cells revealed that 
AMPK activation by 991 caused TFEB
dissociation from mTOR and Raptor. Conversely in
SA5 FNIP1 cells, 991 did not have this effect,
and mTOR and Raptor remained associated
with GFP-TFEB (Fig. 3E). Although mTORC1
signaling to S6K1 and 4EBP1 is decreased by
AMPK activation in SA5 FNIP1 cells, it appears
that TFEB is constitutively phosphorylated
in cells expressing SA5 FNIP1 treated with
991 because mTORC1 remains associated
with TFEB.


AMPK phosphorylation of FNIP1 inhibits
FNIP1-FLCN GAP activity to control TFE
transcription factors through RagC


In cells stimulated with AAs, the Rag GTPases 
recruit mTORC1 to the lysosomal surface, where
it is activated. The Rag proteins function as 
heterodimers, in which the active complex 
consists of GTP-bound RagA or RagB in
complex with guanosine diphosphate (GDP)-bound
RagC or RagD (38, 39). Activation of mTORC1
by intracellular AAs occurs because AAs stim- 
ulate GTP binding to RagA and RagB, promot- 
ing binding to Raptor and assembly of the 
activated mTORC1 complex (40). In the absence 
of AAs, the Rags take up an inactive confor- 
mation (GDP-bound RagA or B and GTP-bound 
RagC or D), causing inactivation and
relocalization of mTORC1 to the cytosol. Concordantly,
active Rag heterodimers interact with and
promote recruitment of TFEB to the lysosomes,
leading to mTORC1-dependent phosphorylation 
and retention of TFEB in the cytosol. Depletion 
or inactivation of Rags prevents recruitment of 
TFEB to lysosomes (5, 7, 9). Furthermore, a 
catalytic arginine in FLCN is required for AA- 
dependent translocation of TFEB and TFE3, con- 
necting control of FLCN-FNIP1 GAP activity to 
AA regulation of RagC (15). To further delineate 
how FNIP1 phosphorylation by AMPK controls 
TFEB, we treated the WT FNIP1 and SA5 FNIP1
cells stably expressing GFP-TFEB with 991 and
immunoprecipitated GFP-TFEB. Within 10 min,
interaction of endogenous RagA and especially
endogenous RagC with GFP-TFEB was enhanced,
and this was maintained for the duration
of the treatment (Fig. 3F). Conversely in
SA5 FNIP1 cells, RagA and RagC interactions 
with TFEB did not change after treatment with 
991 (Fig. 3F). RagB binding to TFEB was minimal 
and did not change between WT and SA5 FNIP1 
cells or with 991 treatment. 

To determine how the FLCN-FNIP1-RagA-C 
complex controls TFEB after AMPK activation, 
we tested whether components of the TFEB 
regulatory machinery changed localization at 
the lysosome in cells exposed to 991. We used 
Lyso-IP, a method for the rapid isolation of 
mammalian lysosomes (41), which uses ex- 
pression of lysosomal transmembrane pro- 
tein 192 fused to three tandem hemagglutinin 
(HA) tags (HA-TMEM192) in WT FNIP1 and 
SA5 FNIP1 HEK293T cells (fig. S4B). As con- 
trols, we also expressed FLAG-TMEM192 in 
WT FNIP1 and SA5 FNIP1 HEK293T cells. All 
cell lines were treated with 991, and HA- 
TMEM192 was immunoprecipitated. The ly- 
sosome surface marker, lysosomal-associated 
membrane protein 1 (Lamp1), was highly en- 
riched in the immunopurified lysosomal frac- 
tion with little remaining in the supernatant, 
which indicates that most of the lysosomes 
had been extracted with the Lyso-IP technique 
(fig. S4B). Organelle markers, such as Golgin97 
for Golgi, were not detected in the immuno- 
precipitates, demonstrating that organelle 
contamination had not occurred in the lyso- 
some purifications. We observed a rapid de- 
crease in association of endogenous RagC and 
RagA with the lysosome in WT FNIP1-HA- 
TMEM192 cells within 20 min of treatment 
with 991 (fig. S4B). By contrast, both RagC and 
RagA remained in the lysosome fractions in 
SA5 FNIP1-HA-TMEM192 cells with or without 
991. FLCN, FNIP1, and mTOR followed a similar 
pattern. Depletion of AAs reduced RagA and 
RagC in lysosome fractions in both WT FNIP1 and
SA5 FNIP1 cells, reiterating that
AMPK-FNIP1-mediated control of mTOR and TFEB is not
required in these conditions and is distinct from
TFEB regulation by AAs and mTOR (fig. S4C). 

To visualize the changes in RagC and mTOR 
localization after AMPK activation in cells,
we used immunofluorescence imaging of
endogenous RagC or mTOR, with antibodies to
RagC or mTOR. RagC localized with the
lysosomal marker Lamp2 in WT FNIP1 cells in
fresh medium (Fig. 3, G and H). When AMPK was
activated with 991, RagC localization with
Lamp2, measured by Pearson's correlation,
was one-third of that in WT FNIP1 cells in
fresh medium, whereas RagC remained localized
with Lamp2 in SA5 FNIP1 cells with or without
991 (Fig. 3, G and H), confirming our
biochemical observations. mTOR similarly showed
decreased lysosomal localization
in WT FNIP1 cells treated with 991 but
not in cells expressing SA5 FNIP1 (fig. S4, D
and E). Thus, AMPK activation appears to 
displace RagC and mTOR from lysosomes, 
preventing TFEB phosphorylation (fig. S4F). 
The enhanced interaction of RagC and TFEB 
in cells in which AMPK is activated with 991 
indicates that RagC is not only required for 
recruitment of TFEB to the lysosome but also 
that increased binding to RagC may promote 
TFEB removal from the lysosome, perhaps by 
physically chaperoning TFEB into the cytosol. 
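
For reference, Pearson's correlation as a colocalization measure compares the pixel intensities of two channels (here, RagC and Lamp2) over the same set of pixels. The Python sketch below computes the coefficient from two equal-length intensity lists. It is a minimal illustration with a hypothetical function name; the imaging software, thresholding, and region selection used in the study are not described here and are not reproduced.

```python
from math import sqrt
from typing import Sequence


def pearson_r(channel_a: Sequence[float], channel_b: Sequence[float]) -> float:
    """Pearson correlation between paired pixel intensities of two channels."""
    if len(channel_a) != len(channel_b) or len(channel_a) < 2:
        raise ValueError("channels must have the same length (at least 2 pixels)")
    n = len(channel_a)
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no defined correlation")
    return cov / sqrt(var_a * var_b)
```

Values near 1 indicate that the two signals rise and fall together across pixels, and values near 0 indicate no linear relationship, so a drop in the coefficient after treatment is read as loss of colocalization.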

We explored how FNIP1 phosphorylation by 
AMPK triggers the disassembly of the lysosomal 
machinery controlling TFEB. The FLCN-FNIP1 
complex functions as a GAP for RagC and RagD, 
promoting the GDP-bound state of RagC and 
RagD, which is required for mTOR
recruitment to the lysosome (11-13). Thus,
phosphorylation of FNIP1 by AMPK might alter the
GAP activity of the FLCN-FNIP1 complex. To 
test this, we transiently transfected either an 
HA-tagged, GTP-locked mutant of RagC (Q120L) 
or an HA-tagged, GDP-locked RagC mutant 
(S75N) in WT FNIP1 and SA5 FNIP1 or WT 
and AMPK KO cells and treated them with 991 
(Fig. 3I and fig. S4G). If phosphorylation of 
FNIP1 by AMPK blocks the GAP activity of the 
FLCN-FNIP1 complex toward RagC, then GDP- 
loaded RagC should accumulate in cells lack- 
ing AMPK or SA5 cells. One prediction of this 
model is that overexpression of the GTP-locked 
mutant of RagC (Q120L) but not the GDP- 
locked RagC mutant (S75N) should restore TFEB 
and TFE3 dephosphorylation in AMPK KO cells 
and SA5 FNIP1 cells. Overexpression of GTP- 
locked RagC overrode the effect of SA5 FNIP1, 
allowing TFEB and TFE3 to be
dephosphorylated in SA5 cells with or without 991 (Fig. 3I).
We observed the opposite effect in WT FNIP1 
cells, in which less TFEB dephosphorylation oc- 
curred after treatment with 991, when the GDP- 
locked RagC mutant was overexpressed compared 
with when GTP-locked RagC was present (Fig.
3I). Similar results were observed in the AMPK
KO cells, where GTP-locked RagC overcame the 
absence of AMPK, enabling dephosphorylation 
of TFEB, whereas GDP-locked RagC prevented 
complete TFEB dephosphorylation in WT cells 
treated with 991 (fig. S4G). 

Thus, AMPK phosphorylation of FNIP1 
appears to inhibit FLCN-FNIP1 GAP activ-
ity, driving RagC to accumulate in its inac-
tive GTP-bound form, which not only falls off
the lysosome (Fig. 3, G and H, and fig. S4B) 
but also binds more tightly to TFEB (Fig. 3F). 
Immunoprecipitation experiments using cells 
expressing GTP- or GDP-locked RagC mutants, 
under the same conditions as shown in Fig.
3I, showed that in WT FNIP1 cells, more TFEB
was bound to RagC in the GTP-locked state, 
bypassing the need for AMPK activation, in 
contrast to the GDP-locked state in which 
binding only occurred if cells were treated 
with 991 (presumably because of the presence 
of endogenous RagC, which would also bind 
TFEB in 991-treated cells) (Fig. 3J). Furthermore, 
in SA5 FNIP1 cells, exogenous GTP-locked RagC
was associated with TFEB, overriding the ef- 
fect of the SA5 mutations, whereas GDP-locked 
RagC did not bind TFEB to the same extent, 
regardless of whether cells were treated with 
991 (Fig. 3J). A similar pattern was observed 
in the WT cells compared with cells lacking 
AMPK (fig. S4H). These results further cor- 
roborate and explain the results shown in Fig. 
3F: Increased HA-RagC-TFEB interaction oc- 
curs in cells with active AMPK because of in- 
activation of the FLCN-FNIP1 GAP complex, 
propelling RagC to its GTP-bound state, which 
binds to TFEB more strongly than the GDP- 
bound form of RagC. Although GDP-RagC re- 
cruits TFEB to the lysosome for mTOR-dependent 
phosphorylation, GTP-RagC appears to be re- 
quired to chaperone TFEB off the lysosome, 
preventing its phosphorylation. 


FNIP1 phosphorylation by AMPK is required 
for lysosomal biogenesis 


The MiT-TFE family of transcription factors, 
including TFEB and TFE3, are oncogenes and 
master regulators of lysosome biogenesis and 
autophagy (16, 17). Given that AMPK activa- 
tion leads to nuclear translocation of TFEB 
and TFE3, an increase in their transcriptional 
activity would be expected. Our RNA-seq data 
in WT and AMPK KO HEK293T cells displayed 
an AMPK-dependent lysosomal gene signa- 
ture. To assess the functional role of FNIP1 
phosphorylation by AMPK, we analyzed global 
transcription in WT and SA5 FNIP1 cells treated 
with 991 for 0 to 16 hours by RNA-seq. Un-
supervised hierarchical clustering of differen-
tially expressed genes (P < 0.05; FC ≥ 1.3)
revealed that genes clustered according to 
991 treatment or FNIP1 mutation status (WT 
or SA5) or both, demonstrating that FNIP1 
phosphorylation by AMPK governed ~20% of 
the AMPK-responsive genes, whose transcrip- 
tion increased in cells treated with 991 (fig. S5, 
A and B). Enrichment analysis of differentially 
expressed genes found “clathrin-coated endo- 
cytic vesicle membrane” and “autolysosome” to 
be the most overrepresented terms in the GO 
cellular components category for transcripts less 
abundant in SA5 FNIP1 cells than in WT FNIP1
cells treated with 991 (fig. S5, C and D); GSEA
analysis revealed “KEGG Lysosome” as one of 
the most enriched gene sets in WT FNIP1 
cells treated with 991 for 16 hours compared

SA5 conditions. (B) RNA-seq analysis of WT FNIP1 and SA5 FNIP1 cells
subjected to a 0- to 16-hour 991

Gene expression pattern analysis showed that
~75% of these CLEAR target genes showed 
increased transcription after AMPK activation 
in WT FNIP1 conditions (Fig. 4B and fig. S5, 
E and F). The genes clustered into four main 
groups (Fig. 4B)—one group showed early tran- 
scriptional activation between 2 and 4 hours 
only, the second group showed two waves of 
transcription with one wave at an early time 
point (2 hours) and the next at a later time 
point (16 hours), the third group was activated 
early from 2 hours onward but remained stead- 
ily high up to 16 hours, and the fourth group 
responded mainly at later time points between 
8 and 16 hours. A large proportion of these 
genes, whose expression was increased by 991 
in WT FNIP1 cells, did not respond to AMPK
activation in cells overexpressing SA5 FNIP1 
(~600 genes) (Fig. 4, B and C, and fig. S5C). 

To validate some of the targets from our 
RNA-seq analysis, we subjected another batch 
of WT FNIP1 and SA5 FNIP1 cells to prolonged 
treatment with 991 for 0 to 30 hours and per- 
formed qPCR with primers targeting several 
canonical CLEAR network members. Expres- 
sion patterns were similar to those in the RNA- 
seq data. A rapid increase in expression of 
SESN, HEXA, Neu1, Lamp1, FNIP2, and ULK1
mRNA, ranging from ~1.2- to 6-fold, was de-
tected within 2 to 16 hours of 991 adminis-
tration in WT FNIP1 cells (Fig. 4, D to I). With
several lysosomal CLEAR target genes, such as
Neu1, SESN, HEXA, and Lamp1, two separate
waves of transcription were observed in WT 
FNIP1 cells—one at early time points between 
1 and 2 hours and a second larger wave occur- 
ring at 16 to 24 hours. However, when AMPK 
phosphosites on FNIP1 were mutated, no in- 
creased transcription of the CLEAR genes was 
detected (Fig. 4, D to I). Glucose deprivation 
showed similar effects with increased transcrip- 
tion of the CLEAR network gene GLA in WT 
FNIP1 but not in SA5 FNIP1 cells (fig. S5G).

To demonstrate that the differential expres- 
sion of lysosomal genes between WT and SA5 
FNIP1 cells is regulated by TFEB and TFE3, 
we used CRISPR methods to make HEK293T 
cells lacking both TFEB and TFE3 (fig. S5, 
H and I). A cell line lacking only TFEB had 
minimal changes in gene expression, suggest- 
ing redundancy between TFEB and TFE3 
(11, 43, 44). We treated parental WT and cells 
lacking both TFEB and TFE3 [TFEB-TFE3 dou- 
ble knockout (DKO)] with 991 for 24 hours 
and performed RNA-seq analysis. Specifically 
focusing on our manually curated list of lyso- 
somal CLEAR genes, this analysis revealed 
loss of expression in a subset of AMPK-FNIP1- 
dependent lysosomal genes in the TFEB-TFE3 
DKO condition compared with the control, in-
cluding GLA and LAMTOR4 (Fig. 4, J and K).

To further study the effect of AMPK and 
FNIP1 on lysosomal proteins, we examined 
protein expression of lysosomal components
including Lamp1, Lamtor1, and cathepsin B
(45). After 991 treatment, the amounts of
nascent, nonglycosylated Lamp1 protein
correlated strongly with the expression levels
of Lamp1 mRNA, showing two separate waves
of increased expression for both mRNA and
protein in WT FNIP1 cells (Fig. 4). By contrast,
no change in abundance of the Lamp1 protein
was detected in the cells overexpressing SA5
FNIP1, which had low amounts like those in
WT FNIP1 cells before AMPK activation; Lamtor1
and cathepsin B followed similar expression
patterns to that of Lamp1 (Fig. 4L).

We used immunofluorescence imaging to 
analyze lysosomal structures by staining for 
intracellular Lamp2. Lysosomal structures 
were quantitated by measurement of lyso- 
some volume and Lamp2 fluorescence inten- 
sities. The percentage of lysosomal structures 
above the threshold volume of 0.1 µm³ in WT
FNIP1 cells treated with 991 displayed the 
same biphasic up-regulation pattern observed 
for Lamp1 mRNA and protein (Fig. 4, G and L 
to N). In 991-treated SA5 FNIP1 cells, the per-
centage of lysosomes varied but did not in-
crease beyond the starting time point (Fig. 4,
M and N). We also quantitated the Lamp2 
sum intensity per lysosome, which did not dis- 
play a biphasic pattern but did increase at 16 
and 24 hours of 991 treatment in WT FNIP1 but 
not SA5 FNIP1 cells (Fig. 4O). Thus, FNIP1 phos-
phorylation by AMPK appears to contribute to
lysosomal biogenesis (Fig. 4P). 
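
The volume-threshold readout described above can be illustrated with a minimal sketch. It assumes that image segmentation has already produced one volume per lysosomal structure; the function name, the example volumes, and the strict "greater than" comparison are illustrative assumptions and are not taken from the study's image-analysis pipeline.

```python
from typing import Sequence

# Threshold volume quoted in the text, in cubic micrometers.
VOLUME_THRESHOLD_UM3 = 0.1


def percent_above_threshold(volumes_um3: Sequence[float],
                            threshold: float = VOLUME_THRESHOLD_UM3) -> float:
    """Return the percentage of structures whose volume exceeds the threshold."""
    if not volumes_um3:
        raise ValueError("no structures were measured")
    n_above = sum(1 for volume in volumes_um3 if volume > threshold)
    return 100.0 * n_above / len(volumes_um3)


# Hypothetical per-structure volumes for one field of view (not study data).
example_volumes = [0.04, 0.08, 0.12, 0.30, 0.09, 0.15, 0.02, 0.22]
print(percent_above_threshold(example_volumes))  # 4 of 8 structures exceed 0.1
```

Computing this percentage separately for each time point and genotype would give the kind of curve against which a biphasic pattern could be assessed.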


AMPK regulation of PGC1α gene induction
is governed by FNIP1 phosphorylation 


Phenformin, rotenone, and CCCP all induced 
expression of PGC1α (PPARGC1A) mRNA in WT
cells but not cells lacking AMPK in our RNA-seq
datasets. The RNA-seq analysis also showed 991-
induced transcription of PPARGC1A mRNA in
WT FNIP1 but not SA5 FNIP1 cells (Fig. 4C). The
PPARGC1A proximal promoter is a direct target
of TFEB and TFE3 (46-48). Therefore, FNIP1 
phosphorylation by AMPK, through control of 
TFEB-TFE3, might be one of the long-sought 
mechanisms underpinning the transcriptional 
regulation of mitochondrial biogenesis by AMPK. 

The PPARGC1A gene undergoes extensive
alternative splicing (49). qPCR assessment of
total PPARGC1A using primers against exon 2,
which is found in all PGC1α isoforms, showed
an increase in transcription in WT FNIP1 but
not SA5 FNIP1 cells treated with 991 (Total
PPARGC1A, Fig. 5A). In humans, PGC1α tran-
scription has been reported from three dis-
tinct promoters: a proximal promoter located
just upstream of the canonical exon 1a; a distal
alternate promoter followed by an alternative
exon 1b, located ~13.7 kb upstream from exon
1a (50); or a much further upstream promoter
termed the brain-specific promoter, ~500 kb
upstream of the canonical proximal promoter
(51, 52). PPARGC1A contains two potential CLEAR
elements to direct binding of TFEB and TFE3
in the proximal promoter adjacent to the ca-
nonical exon 1a (46, 48). In addition to distinct
promoters, alternative splicing between exons
6 and 7 of the PPARGC1A gene produces a
transcript encoding the N-terminal isoform
of PGC1α (NT-PGC1α), which contains 267 AAs of
classical PGC1α and three AAs from the splicing insert
(53) (Fig. 5B). NT-PGC1α is a constitutive tran-
scriptional coactivator because it retains the
transcription activation and nuclear receptor
interaction domains of full-length PGC1α but is no
longer subject to phosphorylation-mediated
turnover associated with the full-length protein
(53, 54), thus leading to a truncated and tran-
scriptionally active form of PGC1α with a
longer protein half-life.

To further investigate regulation of PPARGC1A
in 991-treated cells, we examined endogenous
PGC1α protein expression by Western blotting
(Fig. 5C). We did not detect the canonical, full-
length ~100-kDa isoform of PGC1α in 991-treated
WT or SA5 FNIP1 cells. We did, however, detect
a time-dependent increase in the expression of
the much smaller ~35-kDa N-terminal isoform
of the PGC1α protein in WT FNIP1 cells after
991 administration, under the same conditions
in which FNIP1 is phosphorylated and TFEB
and TFE3 are activated (Fig. 5C). These changes
in the 35-kDa PGC1α protein were quantitated
by densitometry, which showed 8- to 16-fold
increases in PGC1α expression after 991 treat-
ment in WT FNIP1 cells (Fig. 5D). Abundance
of the shorter PGC1α protein remained rela-
tively low in SA5 FNIP1 cells (Fig. 5, C and D).
We detected similar changes using two com-
mercial antibodies (Santa Cruz and EMD
Millipore), both of which have been validated
for detection of NT-PGC1α isoforms (53, 55).

Because PPARGC1A is extensively alterna-
tively spliced, we sought to further examine
which PPARGC1A mRNA splice isoforms might
accumulate in an AMPK-FNIP1-dependent
manner. Analysis of the specific PPARGC1A
mRNA splice isoforms in our RNA-seq data in
WT and AMPK KO cells treated with mito-
chondrial poisons revealed accumulation of
the NT-PPARGC1A short isoform (ENST no.
506055.5) that encodes the 271-AA truncated
form but not full-length PGC1α isoforms (full-
length, ENST no. 264867.7) after these stresses
(fig. S6A). To directly compare with our initial
observations of total PPARGC1A mRNA accu-
mulation (targeting common exon 2, present
in all splice isoforms), we carried out qPCR with
primers targeting exons 5 and 7a to specifically
detect expression of the NT-PPARGC1A alter-
native splice form (Fig. 5A). NT-PPARGC1A
appeared to be more abundant in WT FNIP1
cells treated with 991 than total PPARGC1A
mRNA; no effect was detected in cells express-
ing SA5 FNIP1 (Fig. 5A). qPCR from CCCP-
treated WT and AMPK KO cells showed a
similar pattern to what was observed with 991
treatment in the WT FNIP1 and SA5 FNIP1
cells. After treatment with CCCP, the NT-
PPARGC1A short isoform accumulated in WT
cells but not cells lacking AMPK (Fig. 5E).
Similar results were also seen with the NT-
PPARGC1A short isoform after 991 treatment
in WT cells but not in cells lacking AMPK (Fig.
5F). Glucose deprivation also increased abun-
dance of NT-PPARGC1A mRNA in WT FNIP1
but not SA5 FNIP1 cells (fig. S6B). Thus, the
NT-PGC1α isoform is the predominant splice
isoform of PGC1α expressed in HEK293T cells
in response to mitochondrial poisons, glucose
starvation, or 991 activation of AMPK.

Fig. 5. FNIP1 phosphorylation by
AMPK is critical for induction of the
PGC1α- and ERRα-mediated mitochon-

dots represent genes significantly down-regulated by deletion of ERRα. (O) Four-way
Venn diagram showing overlap of gene sets controlled by AMPK-FNIP1, TFEB-TFE3,
PGC1α, and ERRα. (P) qRT-PCR of mitochondrial genes including IDH2, Cox IV, CytoC,
UCP2, and SOD2 in WT FNIP1 and SA5 FNIP1 HEK293T cells subjected to a 0- to
30-hour 991 time course. All data are shown as means ± SEMs. n = 3. *P < 0.05; **P <
0.01; ***P < 0.001; ****P < 0.0001; unpaired t test. (Q) Model. AMPK phosphorylation
of FNIP1 after energy stress or 991 facilitates TFEB nuclear entry where it binds to
CLEAR network gene promoters, including the PPARGC1A promoter, which induces
expression of the short ~35-kDa transcriptional coactivator NT-PGC1α isoform. In turn
NT-PGC1α transactivates the ERRα transcription factor for induction of mitochondrial genes.

To discern whether FNIP1 is a critical link
between AMPK and PGC1α to promote mito-
chondrial biogenesis, we performed hierar-
chical clustering and differential expression
analyses of our RNA-seq datasets. Again, using 
the Mitocarta 3.0 catalog, we specifically exam- 
ined mitochondrial gene expression in WT 
and SA5 cells treated with 991. This revealed a 
subset of ~200 mitochondrial genes whose 
mRNA expression was increased by 991 at time 
points ranging from 2 to 16 hours, although 
most of these transcripts accumulated at the 
longest time point studied (16 hours) (Fig. 5G 
and fig. S6C). Notably, these transcripts did 
not accumulate when AMPK phosphosites in 
FNIP1 were mutated in the SA5 FNIP1 sam- 
ples, even if cells were treated with 991 (Fig. 5, 
G and K). To investigate whether phosphoryl-
ation of FNIP1 by AMPK triggers mitochon-
drial biogenesis through PGC1α and to define
the mitochondrial gene signature regulated by
AMPK and FNIP1 in a PGC1α-dependent man-
ner, we used small interfering RNA (siRNA) to
deplete HEK293T cells of PGC1α (90 to 95%
efficiency) (fig. S6D) and performed RNA-seq
after treating cells with 991. PGC1α depletion
led to decreased expression of 291 mitochon- 
drial genes from Mitocarta 3.0 (fig. S6E); a sub- 
set of ~80 of these genes was dependent on 
AMPK phosphorylation of FNIP1 (Fig. 5, H 
and M). To test whether changes in mitochon- 
drial gene expression were mediated by AMPK- 
FNIP1 control of TFEB, we also analyzed the 
expression of Mitocarta 3.0 genes in our WT 
and TFEB-TFE3 DKO RNA-seq dataset. We 
specifically examined the AMPK- and FNIP1- 
dependent mitochondrial genes and made an 
observation similar to that observed with the 
CLEAR network genes: Expression of ~50% of 
AMPK-FNIP1-dependent mitochondrial genes
increased by 991 treatment in WT but not TFEB- 
TFE3 DKO samples (Fig. 5, I and L, and fig. S6F). 

We investigated mitochondrial gene expres-
sion in cells expressing WT FNIP1 or SA5 FNIP1
after treatment with 991, validating transcrip-
tion of several PGC1α mitochondrial targets by
qPCR. mRNA levels of IDH2, Cox IV, Cyto C, 
SOD2, and UCP2 remained low at early time 
points but accumulated between 16 and 24 hours 
after 991 administration (Fig. 5P). Cells deprived
of glucose for 6 hours also showed increased
transcription of reported PGC1α target genes
such as PDHA1 in WT FNIP1 cells (fig. S6G). In
cells expressing SA5 FNIP1, transcription of all
mitochondrial genes tested did not increase 
despite prolonged stimulation of cells with 991 
or glucose deprivation (Fig. 5P and fig. S6G). 
Although many lysosomal genes showed two 
waves of expression, mitochondrial gene ex- 
pression generally occurred at later time points, 
consistent with regulation by PGC1α, mRNA of
which was itself transcribed in sync with the first 
wave of lysosomal biogenesis (model in Fig. 5Q). 


AMPK-FNIP1 induction of mitochondrial 
biogenesis requires ERRα


Our RNA-seq analysis detected an AMPK-
dependent increase in transcripts for the nu-
clear receptors ESRRA and ESRRG, an effect that
was abolished by deletion of AMPK or by siRNA
knockdown of PPARGC1A. In WT cells, mRNAs of
genes encoding nuclear receptors, including
ESRRA and ESRRG, were increased from 2 to
16 hours after AMPK activation, and deple-
tion of PPARGC1A suppressed this effect (fig.
S7A). PGC1α initiates mitochondrial biogenesis
through interaction with ERRα (30, 56, 57).
Transcription of the ERRα gene is also in-
creased by PGC1α (29). The requirement of
PGC1α for 991 to promote ESRRA expression
(fig. S7A) suggests that at least some of the
AMPK- and FNIP1-dependent mitochondrial
biogenesis program mediated by PGC1α might
be through transcriptional coactivation of ERRα.
Furthermore, transcription factor enrichment 
analysis of the global transcriptome of WT cells 
and those lacking AMPK revealed that ESRRA 
targets were overrepresented upon 991 treatment 
in the WT condition and underrepresented in 
cells lacking AMPK (fig. S2J). To test this, we 
genetically deleted ESRRA by CRISPR (fig. S7C). 
RNA-seq analysis of 991-treated WT cells or cells
lacking ERRα showed that many Mitocarta 3.0
genes required ERRα for basal expression or in-
creased expression in response to 991 (fig. S7D).
We focused on the AMPK- and FNIP1-dependent
Mitocarta genes to assess which of these were
up-regulated by 991 in WT controls but sup-
pressed when ERRα was absent (Fig. 5, J and
N). We validated the expression of several of
these genes by qPCR, including CPT1A, COX6A1,
IDH2, and PDHA1 (fig. S7F), and observed in-
creased transcription in cells treated with 991,
with maximal increase at later times of 16 to
24 hours. 991 did not induce transcription of
these genes in cells lacking ERRα.

We next performed a four-way comparison of 
the 991-regulated Mitocarta genes that varied 
between (i) WT and SA5 FNIP1 cells, (ii) WT 
and TFEB-TFE3 DKO cells, (iii) control and 
PGC1α siRNA cells, and (iv) WT and cells lack-
ing ERRα (Fig. 5O). We observed an overlap
between all groups, with 131 out of 190 AMPK-
and FNIP1-dependent genes being coregulated
by one or more of the other factors. Twenty
Mitocarta genes were regulated by all four
factors, and 61 genes were regulated by three of
the four conditions (in an SA5-FNIP1-dependent
manner), which we propose as a core minimal
AMPK-FNIP1-TFEB-PGC1α-ERRα-dependent
gene set involved in mitochondrial biogenesis
(Fig. 5O and fig. S7E).
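
The bookkeeping behind such a four-way comparison can be sketched with plain set operations. The snippet below is only an illustration: the gene labels are toy placeholders, the function name is hypothetical, and it assumes each comparison has already been reduced to a set of gene identifiers. It takes the AMPK-FNIP1-dependent set as the reference and counts, for each reference gene, how many of the other three sets also contain it.

```python
from collections import Counter
from typing import Dict, Set


def overlap_histogram(reference: Set[str], others: Dict[str, Set[str]]) -> Counter:
    """Count, for each reference gene, how many of the other gene sets contain it.

    Returns a Counter mapping "number of other sets containing the gene"
    to "number of reference genes with that count".
    """
    histogram: Counter = Counter()
    for gene in reference:
        n_sets = sum(gene in gene_set for gene_set in others.values())
        histogram[n_sets] += 1
    return histogram


# Toy gene labels, purely illustrative (not gene lists from the study).
ampk_fnip1 = {"g1", "g2", "g3", "g4", "g5", "g6"}
other_sets = {
    "TFEB-TFE3": {"g1", "g2", "g3"},
    "PGC1a": {"g1", "g2", "g4"},
    "ERRa": {"g1", "g5", "g7"},
}

hist = overlap_histogram(ampk_fnip1, other_sets)
# Genes shared with at least one other set ("coregulated" in the sense used above).
coregulated = sum(n for k, n in hist.items() if k >= 1)
print(dict(hist), coregulated)
```

In this framing, a count of 3 corresponds to genes regulated by all four factors and a count of 2 to genes regulated by three of the four conditions, always including the AMPK-FNIP1 reference set.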


AMPK-FNIP1 is required for mitochondrial and 
lysosomal biogenesis 


We performed immunoblotting of mitochon- 
drial proteins as a measure of mitochondrial 
biogenesis. We detected increased expression 
of key mitochondrial proteins, including iso- 
citrate dehydrogenase type 2 (IDH2), ubiquinol- 
cytochrome C reductase binding protein (UQCRB), 
and NADH:ubiquinone oxidoreductase sub- 
unit B3 (NDUFB3), after 991 treatment in WT 
FNIP1 cells but not cells expressing SA5 FNIP1 
(Fig. 6A); these were all detected in our RNA- 
seq data showing similar expression patterns. 
In agreement, expression of mitochondrial pro-
teins, such as UQCRB, PDHA1, and NDUFB3,
was increased by 991 treatment of WT but 
not TFEB-TFE3 DKO cells, indicating that these 
proteins also require TFEB and TFE3 (Fig. 6B). 
Finally, we also immunoblotted for these mito- 
chondrial proteins after siRNA knockdown of 
PGC1α or CRISPR deletion of ERRα, observing
that PGC1α (fig. S7B) and ERRα (Fig. 6C) were
both required for these mitochondrial pro- 
teins to accumulate in 991-treated cells. 

Although most of the genes required for 
mitochondrial biogenesis are encoded by the 
nucleus, the mitochondrial genome encodes 
13 proteins, which are components of the 
OXPHOS system. As another parameter to 
measure mitochondrial biogenesis, we quan- 
titated relative mitochondrial DNA (mtDNA) 
copy number by qPCR (58) (i.e., the ratio of
a mitochondrial gene to a reference nu-
clear gene) in our cell lines at the 24-hour time
point when mitochondrial gene transcrip-
tion and protein expression were maximal. In
cells treated with 991 for 24 hours, we observed
an ~30% increase in mtDNA abundance in
WT FNIP1 cells but not in SA5 FNIP1 cells
(Fig. 6D). 
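
As a worked illustration of a mitochondrial-to-nuclear ratio of this kind, the sketch below uses the common delta-Ct convention, in which relative mtDNA abundance is estimated as 2 raised to the difference between the nuclear and mitochondrial Ct values. This assumes roughly 100% amplification efficiency for both primer pairs; the exact calculation in the cited protocol (58) may differ, and the function names and Ct values are hypothetical.

```python
def relative_mtdna(ct_mito: float, ct_nuclear: float) -> float:
    """Relative mtDNA abundance from Ct values.

    Assumes product doubles every cycle (about 100% efficiency), so the
    mitochondrial-to-nuclear ratio is 2 ** (Ct_nuclear - Ct_mito).
    """
    return 2.0 ** (ct_nuclear - ct_mito)


def percent_change(treated: float, control: float) -> float:
    """Percent change of the treated ratio relative to the control ratio."""
    return 100.0 * (treated / control - 1.0)


# Hypothetical Ct values (not measurements from the study).
control_ratio = relative_mtdna(ct_mito=18.0, ct_nuclear=26.0)  # 2 ** 8 = 256.0
treated_ratio = relative_mtdna(ct_mito=17.5, ct_nuclear=26.0)  # 2 ** 8.5
print(round(percent_change(treated_ratio, control_ratio), 1))
```

Because the ratio is exponential in Ct, a shift of half a cycle in the mitochondrial amplicon already corresponds to a change of roughly 40%, which is why technical replicates and a stable nuclear reference matter for this readout.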

To visualize mitochondria in WT and SA5 
FNIP1 cells, we performed Airyscan micros- 
copy after staining cells with IDH2 or CoxIV 
antibodies. Exposure of cells to 991 for 24 hours 
increased endogenous IDH2 staining in mitochon- 
dria in WT but not SA5 FNIP1 cells (Fig. 6, E 
and F). Again using Airyscan microscopy, we 
observed an increase in both mitochondrial 
(CoxIV) and lysosomal (LAMP2) volumes after 
24 hours of 991 treatment in cells expressing 
WT but not SA5 FNIP1 (Fig. 6, G to I, and fig.
S8A). We also observed an apparent increase 
in mitochondria-lysosome colocalization in 
991-treated WT FNIP1 cells (Fig. 6G and fig. S8B), 
which we further examined using serial section
EM. In EM data, an increase in mitochon-
dria to lysosome contacts was observed (as
defined by <20-nm distances between mito-
chondrial outer membrane and lysosome) only
in the WT 991-treated condition (fig. S8, C to E).

Fig. 6. AMPK-FNIP1-mediated mito-
chondrial biogenesis affects mito-
chondrial function and behavior.
(A to C) Western blots probing mito-
chondrial protein expression after a 991
time course ranging from 0 to 30 hours
in WT FNIP1 and SA5 FNIP1 HEK293T
cells (A), WT and TFEB-TFE3 DKO
HEK293T cells (B), and WT and ERRα KO
HEK293T cells (C). (D) Mitochondrial
DNA content analysis. The ratio of
mitochondrial (16S) to nuclear (actin)
DNA was determined by qRT-PCR after
treatment for 24 hours with 991 or DMSO
(vehicle), as indicated. (E) Quantitation
of IDH2 staining in (F). (F) Representative
Airyscan microscopy images of mito-
chondrial IDH2 staining in WT FNIP1 and
SA5 FNIP1 HEK293T cells treated for
24 hours with 991 or DMSO, as indicated.
(G) Representative Airyscan images of
Lamp2-stained lysosomes and Cox IV-
stained mitochondria in WT FNIP1 and
SA5 FNIP1 HEK293T cells treated for
24 hours with DMSO or 991, as indicated.
(H) Quantitation of mitochondrial volume
in (G). (I) Quantitation of lysosomal
volume in (G). (J) Seahorse assay to
measure OCR in WT compared with
AMPK KO HEK293T cells. (K) Seahorse
assays displaying OCR in WT FNIP1
compared with SA5 FNIP1 HEK293T cells.
(L) Seahorse assays measuring OCR
in WT compared with ERRα KO HEK293T
cells. Graphs are shown as the means ±
SEMs. n = 3. *P < 0.05; **P < 0.01;
***P < 0.001, ****P < 0.0001; unpaired
t test.

Finally, we conducted Seahorse experiments 
to measure oxygen consumption rate (OCR) 
as a measure of mitochondrial function in 
WT cells or cells lacking AMPK (Fig. 6J), WT 
FNIP1 and SA5 FNIP1 cells (Fig. 6K), and WT
and cells lacking ERRα (Fig. 6L). OCR was de-
creased in cells lacking AMPK compared with
that in WT cells. Concordantly, OCR was also
reduced in the SA5 FNIP1 cell line compared
with WT FNIP1 cells. Moreover, a comparison
of WT cells with cells lacking ERRα revealed
that OCR was also decreased when ERRα was
deleted. This further supports the common
role of AMPK, FNIP1, and ERRα in controlling
mitochondrial biogenesis and function.


Discussion

FNIP1 was originally identified as a binding
partner for the FLCN hamartoma suppressor 
(34) and in the same study was found to co- 
immunoprecipitate with endogenous AMPK 
subunits, although a functional role for FNIP1 
mediating aspects of AMPK function has never 
been examined in the decades since. Subse-
quent studies have focused on how AMPK sig-
naling is hyperactivated in FLCN-deficient
states (59-62). Our findings indicate that the
physiological relationship between AMPK and 
FLCN-FNIP1 is that after metabolic stress, AMPK 
lies upstream of FLCN-FNIP1, wherein AMPK- 
dependent phosphorylation of FNIP1 acutely 
inhibits FLCN-FNIP1 GAP function, leading to 
TFEB activation. 

Our results are consistent with the emerg- 
ing model that the FNIP1-FLCN GAP complex 
controls RagC to retain TFEB and TFE3 at the 
lysosome and that TFEB and TFE3 are specific 
and selective substrates of mTORC1 in this 
regulation (15, 36, 37). A critical arginine re- 
quired for GAP activity of FLCN in the FLCN- 
FNIP1 complex is required for AA-induced 
control of translocation of TFEB and TFE3 
(5). Our data indicate that AMPK activation 
inhibits FLCN-FNIP1 GAP activity and pro- 
motes the accumulation of GTP-loaded RagC, 
the inactive state of the Rag complex. This 
mechanism allows activation of TFEB and TFE3 
under low-energy conditions, even if AAs are 
plentiful. AMPK phosphorylation of FNIP1 pro- 
motes dissociation of RagC and mTOR from the 
lysosome as well as separation of mTOR from 
TFEB itself, consistent with loss of mTORC1- 
mediated TFEB phosphorylation. In cells that 
are AMPK deficient or just mutated in the five 
AMPK sites in FNIP1, even in the face of en-
ergy stress, the FNIP1-FLCN complex cannot
be regulated by AMPK, and mTOR remains 
bound to TFEB at the lysosomal surface, ren- 
dering TFEB resistant to activation by mito- 
chondrial poisons, glucose deprivation, or direct 
AMPK activators. By contrast, these cells still 
exhibit normal regulation of TFEB in response to 
AA withdrawal. Thus, AMPK maintains specific 
control of TFEB and TFE3 through its phos-
phorylation of FNIP1, independent of AA reg-
ulation of the FLCN-FNIP1 complex. Collectively,
FNIP1 phosphorylation by AMPK at Ser220, Ser230,
Ser232, Ser261, and Ser593 is a key mechanism by
which AMPK controls the MiT-TFE family of
transcription factors to increase lysosomal bio-
genesis and in parallel to increase PGC1α mRNA,
which induces mitochondrial biogenesis, placing
FNIP1 at the center of multiple AMPK-dependent
processes for which the direct biochemical sub- 
strate of AMPK had remained elusive. 


Materials and methods 
Antibodies and reagents 


Abcam antibodies used include Total OXPHOS 
Rodent WB Antibody Cocktail [catalog no. 
(Cat#) ab110413], monoclonal anti-UQCRB (Cat# 
ab190360 [EPR15591]), monoclonal anti-NDUFB3 
(Cat# ab202585 [EPR15571]), monoclonal Tomm20 
(Cat# ab56783), and monoclonal LAMP2 (Cat# 
ab25631). Cell Signaling Technology (CST) anti- 
bodies used in the study were as follows: TFEB 
(Cat# 4240), monoclonal TFEB (D207D, Cat# 
37785), monoclonal Phospho-TFEB S122 (Cat#
86843), Tfe3 (Cat# 14779), FNIP1 (Cat# 36892),
Phospho-FNIP1 S220 (Cat# 40812, in develop-
ment), FLCN (D14G9, Cat# 3697), RagA (D8B5,
Cat# 4357), RagB (D18F3, Cat# 8150), RagC
(D8H5, Cat# 9480), IDH2 (D8E3B, Cat# 56439),
PDHA1 (C54G1, Cat# 3205), Tricarboxylic Acid
Cycle Antibody Sampler Kit (Cat# 47767), mono-
clonal anti-LAMP2 (H4.B4, Abcam Cat# ab25631),
monoclonal anti-LAMP1 (D2D11, Cat# 9091),
ERRα (Cat# 13826), AMPKα (Cat# 2532), Phospho-
AMPKα T172 (40H9, Cat# 2535), ACC (Cat# 3662),
Phospho-ACC S79 (Cat# 3661), Raptor (Cat#
2280), Phospho-Raptor S792 (Cat# 2083), anti-
Ulk1 (D8H5, Cat# 8054), Phospho-S6K T389
(Cat# 9205), 4EBP1 (Cat# 9452), S6 (5G10, Cat#
2217), Phospho-S6 S235/236 (Cat# 4858), mTOR
(Cat# 2972), LAMTOR1/C11orf59 (D11H6, Cat#
8975), monoclonal Hdac3 (7G6C5, Cat# 3949),
Golgin-97 (D8P2K, Cat# 13192), monoclonal
GAPDH (D16H11, Cat# 8884), GFP (D5.1, Cat#
2956), and DYKDDDDK (FLAG) tag (Cat# 2368).
α-Tubulin antibody was from Sigma-Aldrich
(B-5-1-2, Cat# T5168). Monoclonal anti-PGC1α
antibodies were from Santa Cruz Biotechnol- 
ogy (D-5, Cat# sc-518025) and Millipore (41.3, 
Cat# ST1202). Polyclonal Tfeb antibody was 
from Bethyl Laboratories (Cat# A303-673A). 
GFP-Trap Agarose was from ChromoTek (Cat# 
gta-20) and Pierce HA Magnetic Beads were 
from Thermo Fisher Scientific (Cat# 88837). 991 
was purchased from Glixx Laboratories (Cat# 
GLXC-09267), and phenformin hydrochloride 
(Cat# P7045), rotenone (Cat# R8875), CCCP 
(Cat# C2759), and metformin (Cat# PHR1084) 
were from Sigma-Aldrich. 


Plasmids 


The cDNA encoding human FNIP1 (Uniprot 
Q8TF40) was generated by reverse transcrip- 
tion polymerase chain reaction (RT-PCR) from 
RNA obtained from IMR90 cells. Fusion tag 
FLAG-CHERRY was added to the N-terminal 
end of FNIP1 and subcloned into pDONR221 with 
BP Clonase (Invitrogen). The cDNA for human 
FLCN was obtained from Invitrogen (IOH12359). 
Mammalian expression vectors were generated
by recombining ENTR clones into DEST vec-
tors using LR Clonase (Invitrogen). Destination
vectors include: pLentiCMV/TO (Addgene no. 
17293), pcDNA3 N-term FLAG DEST, pcDNA3 
N-Term MYC DEST, and pQCXIB CMV/TO 
(Addgene no. 17400). Site-directed mutagenesis 
for FNIP1 was performed using QuikChange II
XL (Stratagene) according to the manufacturer’s
instructions. Untagged human AMPKα1 cDNA
was cloned into pLentiCMV/TO puro Gateway 
destination vector. Other plasmids used include 
pLJC5-Tmem192-2xFlag (42) (Addgene no. 102929), 
pLJC5-Tmem192-3xHA (47) (Addgene no. 102930), 
pEGFP-N1-TFEB (5) (Addgene no. 38119), pRK5-
HA GST RagC 120L (Addgene no. 19306), and
pRK5-HA GST RagC 75L (Addgene no. 19305).


Cell culture and cell lines 


All cells were cultured in Dulbecco’s modified
Eagle medium (DMEM) supplemented with
10% (v/v) fetal bovine serum (FBS) (Hyclone,
Thermo Fisher Scientific), 2 mM L-glutamine,
and penicillin/streptomycin (Gibco) at 37°C in 5%
CO2 and maintained under antibiotic selection
for stable cell lines. Stably reexpressing WT
FNIP1 and SA5 FNIP1 cells were generated
by using lentivirus-mediated transduction of
FNIP1 KO HEK293T cells with human Myc-
tagged FLCN cDNA, in combination with either
FLAG-Cherry-tagged WT FNIP1, SA4 FNIP1, 
or SA5 FNIP1 cDNA under double puromycin 
(Sigma) and hygromycin (Invitrogen) selec- 
tion. GFP-TFEB cell lines were generated by 
stable infection of WT FNIP1 and SA5 FNIP1 
HEK293T cells with lentivirus expressing GFP- 
TFEB cDNA and blasticidin resistance. Lyso-IP 
cells were generated by stable infection of WT 
FNIP1 or SA5 FNIP1 with lentivirus expressing 
pLJC5-Tmem192-3xHA or pLJC5-Tmem192- 
2xFlag cDNA. Mitochondrial poisons and AMPK- 
activating drugs were used at the following 
concentrations: 991 (50 µM), CCCP (5 µM),
phenformin (2.5 mM), and rotenone (1 mM). 
For AA starvation experiments, cells were first
washed with RPMI AA-free medium supplemented
with 10% dialysed FBS, then the same
medium was added for the times indicated in the figures.
For AA-replete conditions, RPMI AA-free me- 
dium was supplemented with 10% dialysed FBS 
in addition to 1x L-glutamine, 1x essential AAs, 
and 1x nonessential AAs for the times indicated 
in the figures. For glucose deprivation experiments, 
glucose-free DMEM media (Invitrogen) was sup- 
plemented with 10% dialysed FBS and a range 
of glucose concentrations including 1 mM, 2.5 mM, 
and 25 mM. For transient expression of proteins 
and packaging of virus, HEK293T cells were 
transfected with the plasmid of interest using 
Lipofectamine 2000 (Invitrogen, Carlsbad, CA) 
following the manufacturer’s protocol. 


Mouse studies 


All procedures using animals were approved 
by the Salk Institute Institutional Animal Care 
and Use Committee (IACUC) protocol 11-000029. 
AMPKα1 and AMPKα2 floxed allele (Prkaa1fl/fl,
Prkaa2fl/fl) mice bearing Albumin-creERT2 were
treated with tamoxifen (1 mg per day) or ve- 
hicle (control) for 5 consecutive days. Eight weeks 
after tamoxifen injection, mice were fasted overnight,
refed for 1 hour, then treated for 2 hours with vehicle
or MK-8722 (30 mg/kg) before euthanizing
as previously described (63). Livers were collected, 
and lysates were prepared in lysis buffer. In Fig. 
2F, primary hepatocytes were made and treated 
with metformin as previously described (64). 


Western blots 


For biochemical analysis of cells, cells were 
washed with ice-cold phosphate-buffered saline 
(PBS) and lysed in buffer containing 20 mM Tris 
pH 7.5, 150 mM NaCl, 1 mM EDTA, 1 mM EGTA, 
1% Triton X-100, 2.5 mM pyrophosphate, 50 mM 
NaF, 5 mM β-glycerophosphate, 50 nM calyculin
A, 1 mM Na3VO4, and protease inhibitors (Roche).
Lysates were clarified by centrifugation at 
16,000 ×g for 10 min at 4°C. Protein concentration
was calculated using the BCA protein
kit (Pierce). Lysates were resolved on 10 to 
12% SDS-polyacrylamide gel electrophoresis 
(SDS-PAGE) gels, depending on molecular 
weight of proteins assessed and immunoblotted. 
Nuclear and cytoplasmic fractions were isolated 
using a NE-PER nuclear and cytoplasmic ex- 
traction kit (Thermo Fisher Scientific) accord- 
ing to manufacturer’s instructions. 


Immunoprecipitation 


Cells were lysed in standard lysis buffer as de- 
scribed above and samples equilibrated. GFP- 
TFEB was immunoprecipitated from 2 mg of 
lysates using GFP-Trap beads (Chromotek) at 
4°C for 2 hours under rotation. Subsequently, 
the beads were washed three times with lysis 
buffer. Protein complexes were eluted from 
beads using Laemmli sample buffer with 2% 
(v/v) beta-mercaptoethanol. Coimmunopre- 
cipitating proteins were detected by immuno- 
blot analysis. 


Immunopurification of lysosomes (Lyso-IPs) 


Lyso-IP cells were generated by infection with 
viruses containing Tmem192-3xHA (HA-Lyso) 
or Tmem192-2xFlag (Cont-Lyso). The Lyso-IP 
protocol was carried out as described in Abu-Remaileh
et al. (41). Briefly, WT FNIP1/HA-Lyso
or SA5 FNIP1/HA-Lyso cells in addition
to control cells were treated with short 991 time
courses and washed with PBS then scraped into
1-ml KPBS (136 mM KCl, 10 mM KH2PO4,
pH 7.25 adjusted with KOH). Cells were gently
homogenized with 20 strokes of a 2-ml ho- 
mogenizer. The homogenate was then centrifuged 
at 1000 ×g for 2 min and the supernatant containing
the cellular organelles including lysosomes
was incubated with 100 µl of anti-HA
magnetic beads, prewashed in KPBS, on a gentle
rotator for 15 min at 4°C. Immunoprecipitates 
were then gently washed three times with KPBS 
and the lysosome fraction was eluted from 
the beads through incubation in lysis buffer 
for 10 min. 


CRISPR-Cas9 techniques 


Small guide RNAs (sgRNAs) targeting human
TFEB, TFE3, and ERRα were selected using
the Benchling CRISPR design tool
(https://www.benchling.com/crispr/). Guides with
high targeting scores and low probability of 
off-target effects were chosen (table S1). At least 
three independent sgRNA sequences were tested 
for each gene. Oligonucleotides for sgRNAs were 
synthesized by IDT, annealed in vitro, and subcloned
into BsmBI-digested lentiCRISPRv.2-puro
(Addgene no. 52961) or lentiCRISPRv.2-blast
(Addgene no. 98293). Validation of guide spec- 
ificity was assessed by Western blotting of low- 
passage cells after puromycin selection. 
HEK293T AMPK KO cells were generated
using the Cas9 nickase strategy. Briefly, a pair 
of guide RNAs (gRNAs) targeting exon 1 was 
designed for both human PRKAA1 and PRKAA2
genes using the online design tool at
http://crispr.mit.edu (AMPKα1 A/B and AMPKα2 A/B,
respectively) (table S1). Each gRNA duplex was 
cloned into pX462 vector encoding SpCas9n-2A- 
Puro (Addgene no. 48141). HEK293T cells were 
transfected with the gRNA pair to generate 
single AMPKα1 or AMPKα2 KO or transfected
with both pairs together to generate double
AMPK α1/α2 KO (DKO). After puromycin selection,
single-cell cloning was performed by cell
sorting into 96-well plates. Individual clones were 
screened by Western blot and a clone lacking 
both AMPK α1 and α2 protein expression was
selected. HEK293T FNIP1 KO cells were made 
using a single gRNA (table S1) targeting exon 2 
and subsequently cloned into pX459 encoding 
SpCas9(BB)-2A-Puro (Addgene no. 48139). Single- 
cell clones were isolated after selection with 
puromycin and screened by immunoblotting. 


RNA interference studies 


All siRNAs were purchased from Horizon Discovery.
A nontargeting siRNA pool
was used as a negative control (ON-TARGETplus
nontargeting control siRNA, Cat# D-001810-01-05),
and ON-TARGETplus SMARTpool siRNAs
targeting PPARGC1A (L-005111-00-0005),
TFEB (Cat# L-009798-00-00050), or TFE3
(Cat# L-009363-00-0005) were used. HEK293T
cells were plated in a 6-well plate and allowed 
to adhere overnight. Cells were transfected with 
the nontargeting control siRNA pool (20 nM), or 
PPARGC1A siRNA pool (20 nM). Transfection
was carried out using Lipofectamine RNAiMAX 
(Invitrogen) according to the manufacturer’s 
protocol. 48 hours after transfection, cells were 
treated with 991 as indicated in the figures and 
then harvested for downstream applications. 


Lentivirus production 


Lentiviruses were produced by transfecting 
HEK293T cells with the cDNA construct of 
interest including 3xHA-Tmem192, 2xFLAG- 
Tmem192, FLAG-WT FNIP1, FLAG-SA4 FNIP1, 
FLAG-SA5 FNIP1, MYC-FLCN, or GFP-TFEB 
constructs, in combination with VSV-G and 
CMV-ΔVPR packaging plasmids. Sixteen hours
later, the media was changed to DMEM with 
10% FBS. The virus containing supernatant was 
collected the next day and 0.45-µm filtered,
then frozen at −80°C or supplemented with
8 µg/ml polybrene and applied to destination
cells for 24 hours. Sixteen hours later,
the media was refreshed and the appropri- 
ate antibiotic was added for selection. 


In vitro kinase assays


HEK293T cells transiently transfected with
FLAG-FNIP1 or FLAG-tagged FNIP1 mutants
(SA2, SA3, SA4, SA5) were lysed and subjected 
to FLAG immunoprecipitation. Immunoprecipitates
were washed three times in lysis buffer,
followed by three times in kinase assay buffer
(50 mM Tris pH 7.5, 10 mM MgCl2). Subsequently,
immunoprecipitates were subjected
to a kinase reaction containing 0.1 mM [γ-32P]-ATP
(PerkinElmer) and 2 mM dithiothreitol
(DTT) with or without 0.1 pg of active recom- 
binant 50 ng of active recombinant AMPK 
(Millipore no. 14-840) in the presence of ki- 
nase assay buffer. The reaction was incubated 
at 30°C for 30 min. Reactions were terminated 
with LDS sample buffer and resolved by SDS- 
PAGE electrophoresis. Proteins were detected 
with Coomassie staining. Dried gels were exposed 
to UltraCruz autoradiography film overnight, 
in an autoradiography cassette and the films 
were later developed using an auto-developer. 


Immunofluorescence 


Cells were seeded on coverslips (precoated with 
poly-L-lysine). Following treatments described 
in figure legends, cells were fixed with 4% (v/v) 
paraformaldehyde and permeabilized with 
1% (v/v) NP-40. Cells were blocked using 5% 
bovine serum albumin (BSA) in PBS, then in- 
cubated for 1 hour with primary antibodies, 
followed by three washes in 0.2% BSA/PBS. 
Coverslips were incubated for 1 hour with 
secondary antibodies and counterstained with 
4',6-diamidino-2-phenylindole (DAPI) for 5 min, 
then washed three times in 0.2% BSA/PBS and 
once in water. Coverslips were mounted with 
Fluoromount-G (Southern Biotech). 


Confocal imaging of TFEB, TFE3, and RagC 


For visualization of TFEB, TFE3, and RagC, the 
following primary rabbit antibodies were used: 
TFEB (CST no. 37785), TFE3 (CST no. 14779), and
RagC (CST no. 9480) followed by anti-rabbit 
secondary antibodies conjugated to Alexa Fluor 
(AF) 594. Cells were imaged on a Zeiss LSM 700 
confocal microscope using the 63X objective. 
Quantification was performed using ImageJ
software. 


Airyscan confocal imaging of IDH2, Cox IV, 
and Lamp2 


For imaging IDH2, coverslips were incubated 
in anti-IDH2 (CST no. 56439) antibody, followed 
by secondary anti-rabbit antibody conjugated 
to AF-568. IDH2 was costained with phalloidin 
647 and Tomm20 using anti-Tomm20 (Abcam 
no. ab56783) antibody, followed by secondary 
antibody conjugated to AF-488. Cox IV and 
Lamp2 were costained with anti-Cox IV (CST 
no. 11967) and anti-Lamp2 (Abcam no. ab18528) 
antibodies, followed by anti-rabbit secondary 
conjugated to AF-594 and anti-mouse second- 
ary conjugated to AF-488, respectively. Cells 
were imaged with a 63x 1.4 NA oil objective on 
a ZEISS 880 LSM Airyscan confocal system with 
an inverted stage. High-resolution Airyscan 
images were acquired using a pixel dwell time 
of 0.66 µs and 2x Nyquist pixel size of 43 nm
per pixel in SR mode (i.e., a virtual pinhole size 
of 0.2 Airy units), then processed using ZEISS 
Zen software with the Airyscan parameter 
determined by auto-filter settings. The zoom 
factor was set to 2 to obtain a large field of 
view. Alexa Fluor 488 was imaged with a 
488-nm laser with a laser power of ~64 µW,
Alexa Fluor 568 and 594 were imaged with the 
561-nm laser with a laser power of ~268 µW, and
Alexa Fluor 647 was imaged with the 633-nm 
laser with a laser power of ~98 µW.


Fluorescence image quantification 


IDH2 fluorescence intensity quantification 
was performed with Imaris (version 9.6.0, 
Bitplane, Zurich, Switzerland). Mitochondrion 
regions were masked by surfaces generated
from TOM20 fluorescence using automatic threshold
settings in Imaris. Mitochondria-associated
IDH2 was quantified by summing up the
IDH2 voxel intensity values inside the seg- 
mented TOM20 regions and normalizing to 
the mitochondria volume. Statistical signif- 
icance was determined using an unpaired t 
test. Quantification was performed with Ima- 
ris software 9.6.0 (Bitplane). COX IV fluores- 
cence image channels were preprocessed with 
a Gaussian smoothing filter with a filter size 
of 0.0707 µm. Background subtraction was
performed with an estimation of the diam- 
eter of the largest sphere that fits into the 
object as 0.265 µm. Mitochondrial clusters
were then defined by surfaces generated with 
automatic threshold settings, which makes 
use of an iterative selection method. Clusters 
smaller than 0.003 µm³ were removed. Clusters
for LAMP2 fluorescence signal were
defined similarly but without smoothing or 
background subtraction. Touching objects 
were separated using region growing with a 
seed point diameter of 0.25 µm. The Imaris
quality filter with a lower threshold of 86.5 
was then used to select positive signals. Clusters
smaller than 0.065 µm³ were removed. Cell
volume was defined by first smoothing the 
LAMP2 signal with a Gaussian filter with a diameter
of 1 µm. Surfaces were generated with a manual
threshold of 41.5 absolute intensity. Lysosomes
colocalized with mitochondria were defined as
lysosomes located within 0.2 µm of their closest
mitochondria cluster. The lysosome colocalized
volume, total lysosome volume, and mitochon- 
dria volume were normalized to the cell vol- 
ume. Statistical significance was determined 
using an unpaired t test.
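
The two quantities described above (mask-restricted intensity normalized to mask volume, and lysosomes lying within a fixed distance of a mitochondrial cluster) can be sketched outside Imaris roughly as follows. This is a minimal illustration under stated assumptions, not the Imaris procedure used in the study: the function names, the array inputs, and the use of a Euclidean distance transform are all assumptions introduced here.

```python
import numpy as np
from scipy import ndimage


def intensity_per_volume(signal, mask, voxel_volume_um3):
    """Sum signal voxels inside a binary mask and divide by the mask volume."""
    mask = np.asarray(mask, dtype=bool)
    volume_um3 = mask.sum() * voxel_volume_um3
    if volume_um3 == 0:
        raise ValueError("mask is empty")
    return float(np.asarray(signal, dtype=float)[mask].sum() / volume_um3)


def colocalized_fraction(lyso_labels, mito_mask, spacing_um, max_dist_um=0.2):
    """Fraction of labeled lysosome objects within max_dist_um of mitochondria.

    lyso_labels: integer label image (0 = background), hypothetical input.
    mito_mask:   boolean mitochondria segmentation, hypothetical input.
    spacing_um:  voxel size along each axis, in micrometers.
    """
    mito_mask = np.asarray(mito_mask, dtype=bool)
    # Distance (in micrometers) from every voxel to the nearest mitochondrial voxel.
    dist = ndimage.distance_transform_edt(~mito_mask, sampling=spacing_um)
    ids = np.unique(lyso_labels)
    ids = ids[ids != 0]
    if ids.size == 0:
        return 0.0
    # Smallest distance found among the voxels of each lysosome object.
    min_dist = np.asarray(ndimage.minimum(dist, labels=lyso_labels, index=ids))
    return float(np.mean(min_dist <= max_dist_um))
```

The 0.2 µm default mirrors the colocalization cutoff stated in the text; every other parameter is a placeholder to be replaced with the actual acquisition settings.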


Electron microscopy imaging 
and quantification 


Cells were cultured on 10-cm dishes to reach 
~70% confluence before fixation. Materials 
were sourced from Electron Microscopy Sci- 
ences (Hatfield, PA) unless noted otherwise. 
Culture media was gently poured off and 4 ml 
of warm 37°C fixative (8% glutaraldehyde in
0.1 M sodium cacodylate buffer with 3 mM 
CaCl2) was added to the dish before being re- 
placed with ice-cold fixative after 10 s. After 
an hour of fixation at 4°C, cells were rinsed 
with 0.1 M sodium cacodylate buffer with 3 mM 
CaCl2 three times for 10 min and postfixed with 
reduced osmium tetroxide (0.1 M sodium 
cacodylate buffer, 0.1 M CaCl2, 1.5% osmium 
tetroxide, 1.5% potassium ferrocyanide) for 
40 min in the dark at room temperature. 
Dishes were rinsed with ice-cold deionized wa- 
ter three times, with the final rinse of 1 ml of 
water left in the dish, and scraped with a 
sharpened piece of Teflon secured in a hemo- 
stat. The cell suspension was collected into 
Eppendorf tubes and left overnight at 4°C. 
The following day, cells were stained with 
aqueous 1% uranyl acetate at room temper- 
ature for an hour before serial dehydration in 
ice-cold ethanol solutions. Dehydrated cells 
were rinsed twice with anhydrous ethanol at 
room temperature and infiltrated with Eponate 
12 resin (hard formulation) for 2 hours at 3:1 
and 1:1 ethanol:resin mixtures and left in 1:3 
resin to infiltrate overnight. The following day, 
cells were infiltrated in two changes of pure 
resin throughout the day and pelleted in a 
third change of fresh resin at 12,000 rpm in 
a tabletop centrifuge (Pelco) and left to poly- 
merize in an oven at 60°C for 48 hours. 
Polymerized blocks were removed from 
Eppendorf tubes using razor blades and 
trimmed for ultrathin serial sectioning as
previously described (65) using diamond knives 
(Diatome) mounted on a Zeiss UC8 ultra- 
microtome. Series of ultrathin (70-nm) sec- 
tions were collected onto silicon chips and 
imaged using a scanning electron microscope 
(Zeiss Sigma VP) equipped with a backscat- 
tered electron detector (Gatan) using Atlas5 
(Fibics) control software and scan generator. 
Ribbons of serial sections from each condi- 
tion were screened to identify a cluster of cells 
suitable for further analysis. For each series, 
the region of interest in each section was iden- 
tified and captured at an overview resolution 
(200 nm per pixel) and a midlevel resolution 
(8 nm per pixel) to minimize drift during 
high-resolution (2 nm per pixel) acquisition. 
High-resolution stacks of images were aligned. 
Approximately 50 mitochondria from two 
randomly sampled subvolumes of registered 
3DEM image stacks were segmented using Vast 
Lite (66). The two volumes (total 105.28 µm³,
average = 52.64 µm³) of densely labeled
mitochondria (n = 53, average = 26.5) from 
the WT-DMSO condition were proofread. Be- 
cause of the prevalence of touching mitochon- 
dria, all labels were first eroded to ensure 
separated boundaries. A three-dimensional 
(3D) U-Net was used to detect boundaries and 
LSDs in a multitask learning framework. The 
network had an input shape of [48,284,284] 
and output shape of [16,196,196] (voxels, zyx).
It consisted of three layers and was down- 
sampled by a factor of [1,2,2] in the first two 
layers and [2,2,2] in the final layer. The re- 
verse was done for the up-sampling path. 
Twelve initial feature maps were used and 
multiplied by a factor of 5 between layers. 
The number of feature maps in the last layer 
was increased to 14 to account for the 13 fea- 
ture maps generated by the boundaries (3) 
and LSDs (10). A mean squared error loss and 
Adam optimizer were used to train the net- 
work. A single voxel neighborhood [1,1,1] was 
used for the boundaries. The LSDs used a sigma 
of 140 and were down-sampled by a factor 
of 2. Training was done using Tensorflow and 
Gunpowder. Inference, seeded watershed, and 
percentile agglomeration were performed in 
a block-wise fashion for each condition. Four 
subvolumes from the non-WT-DMSO conditions 
(1 WT-991, 1 SA5-DMSO, 2 SA5-991, 116.73 µm³,
average = 29.2 µm³) were proofread (n = 110,
average = 27.5) and used to refine training 
along with the original two volumes. Addi- 
tionally, 11 negative samples which did not 
contain any mitochondria (i.e., resin, cyto- 
sol, background) were used for retraining 
(~450 µm³, average = 40.9 µm³). Each batch
randomly sampled from one of the 17 total 
subvolumes such that the probability of choos- 
ing either a positive or negative sample was 
50%. After training, segmentations were cre- 
ated for each condition, and small objects 
(i.e., oversegmented debris) were filtered out. 
Mitochondria were then randomly sampled 
from each volume (n = 100) and proofread for
subsequent analysis. Iterative cycles of train- 
ing and manual proofreading were used until 
a fairly accurate but excessively permissive 
segmentation of mitochondria was achieved 
for all conditions. One hundred mitochondria 
were randomly sampled from the machine- 
generated segments. Segmentations were me- 
ticulously proofread by human experts before 
inclusion in analysis. Contingency tables of 
sampled mitochondria with and without 
lysosomal contacts were assembled for pairs 
of experimental conditions (i.e., combinations 
of WT/SA5, DMSO/991 treatment). Chi square 
tests were performed in Python using the 
chi2_contingency function in the scipy.stats 
library. 
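
As a rough illustration of the contingency-table test described above, the following sketch applies chi2_contingency to one pair of conditions. The helper name contact_chi_square and any counts passed to it are hypothetical and are not data from this study. Note that scipy applies Yates' continuity correction to 2 × 2 tables by default; the text does not state whether that default was kept, so the flag is exposed here as an assumption.

```python
from scipy.stats import chi2_contingency


def contact_chi_square(with_a, without_a, with_b, without_b, correction=True):
    """Chi-square test on a 2 x 2 table of mitochondria with/without lysosomal contacts.

    Rows are the two experimental conditions being compared; columns are
    counts of sampled mitochondria with and without a lysosomal contact.
    """
    table = [[with_a, without_a], [with_b, without_b]]
    chi2, p_value, dof, _expected = chi2_contingency(table, correction=correction)
    return chi2, p_value, dof
```

With 100 mitochondria sampled per condition, each row of the table sums to 100, so the test compares the proportion with contacts between the two conditions.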


Mass spectrometry 


HEK293T cells were transfected with an epitope- 
tagged FNIP1 cDNA, and cells were treated 
with DMSO or phenformin. After immunopu- 
rification of FNIP1 protein, FNIP1 was isolated 
from an SDS-polyacrylamide gel. Bands on the 
gels were cut out and subjected to reduction 
with dithiothreitol, alkylation with iodoaceta- 
mide, and in-gel digestion with chymotrypsin 
overnight at pH 8.3, followed by reversed-phase 
microcapillary or liquid chromatography with 
tandem mass spectrometry (LC-MS/MS).
LC-MS/MS was performed using an Easy-nLC
nanoflow HPLC (Proxeon Biosciences) with a 
self-packed 75 µm id × 15 cm C18 column coupled
to a LTQ-Orbitrap XL mass spectrometer
(Thermo Fisher Scientific) in the data-dependent 
acquisition and positive ion mode at 300 nL/min. 
Peptide ions from predicted phosphorylation 
sites were also targeted in MS/MS mode for 
quantitative analyses. MS/MS spectra collected 
through collision-induced dissociation in the ion 
trap were searched against the concatenated 
target and decoy (reversed) single entry and 
full Swiss-Prot protein databases using Sequest 
(Proteomics Browser Software, Thermo Fisher 
Scientific) with differential modifications for 
Ser/Thr/Tyr phosphorylation (+79.97) and the 
sample processing artifacts Met oxidation 
(+15.99), deamidation of Asn and Gln (+0.984) 
and Cys alkylation (+57.02). Phosphorylated 
and unphosphorylated peptide sequences were 
identified if they initially passed the following 
Sequest scoring thresholds against the target 
database: 1+ ions, Xcorr ≥ 2.0, Sf ≥ 0.4, P ≥ 5;
2+ ions, Xcorr ≥ 2.0, Sf ≥ 0.4, P ≥ 5; 3+ ions,
Xcorr ≥ 2.60, Sf ≥ 0.4, P ≥ 5 against the target
protein database. Passing MS/MS spectra were 
manually inspected to ensure that all b- and 
y-fragment ions aligned with the assigned se- 
quence and modification sites. Determination of 
the exact phosphorylation sites was aided using 
FuzzyIons and GraphMod and phosphorylation
site maps were created using ProteinReport 
software (Proteomics Browser Software suite, 
Thermo Fisher Scientific). False discovery rates 
(FDRs) of peptide hits (phosphorylated and 
unphosphorylated) were estimated below 1.5% 
based on reversed database hits. 
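
The charge-dependent score filter and the reversed-database FDR estimate described above can be sketched as follows. This is an illustrative reconstruction, not the Proteomics Browser Software implementation: the PSM record, the reading of the listed scores as minimum thresholds, and the estimator 2 × decoys / total accepted hits (a common convention for concatenated target-decoy searches) are all assumptions, since the text does not give the exact formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """Hypothetical peptide-spectrum match record."""
    charge: int
    xcorr: float
    sf: float
    p: float
    is_decoy: bool  # True if the hit maps to a reversed (decoy) sequence


# Minimum (Xcorr, Sf, P) per charge state, taken from the thresholds in the text.
THRESHOLDS = {1: (2.0, 0.4, 5.0), 2: (2.0, 0.4, 5.0), 3: (2.6, 0.4, 5.0)}


def passes_thresholds(psm):
    """Return True if the PSM meets the score thresholds for its charge state."""
    limits = THRESHOLDS.get(psm.charge)
    if limits is None:
        return False
    min_xcorr, min_sf, min_p = limits
    return psm.xcorr >= min_xcorr and psm.sf >= min_sf and psm.p >= min_p


def estimate_fdr(psms):
    """Estimate FDR among accepted hits as 2 * decoys / accepted (assumed estimator)."""
    accepted = [m for m in psms if passes_thresholds(m)]
    if not accepted:
        return 0.0
    n_decoy = sum(1 for m in accepted if m.is_decoy)
    return 2.0 * n_decoy / len(accepted)
```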


Seahorse assays 


Oxygen consumption rate (OCR) of cells was measured using the Seahorse
XF96 Cell Mito Stress Test Kit (Seahorse
Biosciences) with an XF96 Extracellular Flux 
Analyzer (Seahorse Bioscience) in accordance 
with the manufacturer’s instructions. Seahorse 
XF Cell Mito Stress Test Kit (Cat# 103015-100) 
and Seahorse XFe96 FluxPak (Cat# 102416-100)
were purchased from Agilent Technologies. The 
XF96 cell culture microplates were polylysine 
coated before seeding HEK293T cells in XF 
base media supplemented with 1 mM pyru- 
vate, 2 mM glutamine, and 10 mM glucose. 
Each condition was seeded in six replicate 
wells. The Seahorse sensor cartridge was hydrated
overnight in a non-CO2 incubator at
37°C. On the day of the assay, cells were incubated
in a CO2-free incubator at 37°C for 1 hour
to allow for temperature and pH equilibration 
before loading into the XF96 apparatus. Mito- 
chondrial stress tests were conducted following 
Seahorse guidelines (Agilent Technologies). 
Inhibitors were used at the following concentrations:
1 µM oligomycin, 1 µM FCCP, and
0.5 µM antimycin A + 0.5 µM rotenone. Analyses
were conducted using Wave software (Agilent
Technologies). 


mRNA extraction and qPCR


RNA was extracted from cells using the QIAGEN RNeasy
Plus mini kit (Cat# 74134), and 1 µg of RNA was
reverse transcribed using the iScript cDNA Synthesis
Kit (Bio-Rad, Cat# 1708891). qPCR used
diluted cDNA, relevant primers, and SYBR
Green PCR master mix (Thermo Fisher Scientific,
Cat# 4309155) in a C1000 Thermal Cycler
(BioRad). Relative mRNA levels were calculated
using the ΔΔCt method, with β-actin serving
as the internal control. All mRNA measurements
were performed in triplicate. All primers
are listed in table S1. 
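
For readers unfamiliar with the ΔΔCt calculation, a minimal sketch is given below. It assumes roughly 100% amplification efficiency (a doubling per cycle) and averages technical replicates before subtraction; the function name and argument layout are illustrative and are not taken from the study.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative mRNA level by the delta-delta-Ct method.

    Each argument is a list of replicate Ct values. The reference gene
    (beta-actin in the text) normalizes for input amount; the control
    condition sets the baseline of 1.0.
    """
    delta_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_control = mean(ct_target_control) - mean(ct_ref_control)
    delta_delta = delta_sample - delta_control
    # Assumes the product doubles every cycle (efficiency of 2).
    return 2.0 ** (-delta_delta)
```

A target that crosses threshold two cycles earlier in the treated sample than in the control, at equal reference-gene Ct, therefore corresponds to a fourfold increase.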


Mitochondrial DNA content analysis 


Mitochondrial DNA was quantified by deter- 
mining the mtDNA/nDNA ratio. Total genomic 
DNA was extracted from cells using the Qiagen 
DNeasy Blood & Tissue Kit (Cat# 69504). Extracted
DNA was diluted to a final concentration
of 10 ng of DNA per microliter, and mitochon- 
drial DNA was quantified relative to the nuclear 
DNA specific gene β-actin by qPCR. Primers
used for β-actin and mitochondrial 16S rRNA
genes are listed in table S1. Reactions used SYBR 
Green PCR Master Mix (Thermo Fisher Scien- 
tific), using a C1000 Thermal Cycler (BioRad). 
The relative mtDNA copy number was calculated
using the ΔΔCt method.


RNA-seq 


Cells were treated with 991, CCCP, rotenone,
or phenformin in time courses ranging
from 0 to 24 hours. Each time point and con- 
dition was prepared in triplicate. Total RNA 
was isolated using the QIAGEN RNeasy Plus
mini kit. The quality of the isolated total RNA 
was assessed using Agilent TapeStation 4200 
and RNA-seq libraries were prepared with 
500 ng of total RNA using the TruSeq stranded 
mRNA Sample Preparation Kit according to 
the manufacturer’s protocol (Illumina). Libra- 
ries were quantified, multiplexed, and pooled 
for sequencing at paired-end 75 base pairs using 
the Illumina NextSeq500 or NovaSeq6000 platform
at the Salk Next-Generation Sequencing
Core. Raw sequencing data were demultiplexed
and converted into FASTQ files using CASAVA
(version 1.8.2). Libraries were sequenced at
an average depth of 12 to 40 million reads 
per sample. 


Bioinformatic analysis of RNA-seq data 


Raw RNA-seq reads in FASTQ files were quality- 
tested using FASTQC (v0.11.8) (Andrews 2010) 
and mapped to the human reference genome 
(GRCh38) with STAR (v2.5.3a) aligner with de- 
fault parameters (67). Raw or TPM (transcripts 
per million) gene expression levels were quan- 
tified across all the exons of the top isoform in 
RefSeq with analyzeRepeats.pl in HOMER
(v4.11.1) (68). Differential expression analysis was
performed on the Biojupies platform (https:// 
amp.pharm.mssm.edu/biojupies/) (69) with 
raw gene counts, comparing gene expression 
levels between control and experimental groups 
using the limma R package (70). Replicates were 
used to compute within-group dispersion and 
correction for batch effects. Volcano plots were 
generated to display the results of differential 
gene expression analysis using VolcaNoseR (71).
Gene fold changes were log2 transformed and
displayed on the x axis; P values were cor- 
rected using the Benjamini-Hochberg method, 
transformed using −log10, and displayed on the
y axis. Differentially expressed genes were 
defined as having a P value < 0.05; FC ≥ 1.3.
Red points indicate significantly up-regulated 
genes, and blue points indicate significantly 
down-regulated genes. Raw gene counts were 
first normalized to TPM (transcripts per mil- 
lion). TPM counts were log transformed, filtered, 
scaled, and centered before gene clustering and 
heatmap generation using ClustVis (72), Heat- 
mapper (73), or the R package (pheatmap). 
ClustVis uses code from BoxPlotR; several R 
packages are used internally, including shiny, 
ggplot2, pheatmap, gridSVG, RColorBrewer,
FactoMineR, pcaMethods, gProfileR, shinyBS, 
shinyjs, and others. The source code of ClustVis 
is available in GitHub. For enrichment analy- 
ses, the up-regulated and down-regulated gene 
sets were generated by extracting genes with the 
respectively highest and lowest values from the 
gene expression signature. The gene sets were 
subsequently submitted to Enrichr (74), which is 
freely available at http://amp.pharm.mssm.edu/ 
Enrichr/. The following libraries were used for 
the analysis: GO_Cellular_Component_2018, 
WikiPathways_2016, and ChEA_2016. Signifi- 
cant terms were determined by using a cut-off 
of P < 0.1 after applying Benjamini-Hochberg 
correction. GSEA was carried out with GSEA 
desktop v4.0.3 using preranked lists generated 
from FDR values, setting gene set permutations
to 1000, using the Hallmark or the c2 collections
in MSigDB v7.2 or “KEGG Lysosome”
gene sets. Four-way Venn diagrams were gen- 
erated using Venny 2.1 (https://bioinfogp.cnb. 
csic.es/tools/venny/). 
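
The volcano-plot transformation described above (log2 fold change on the x axis, Benjamini-Hochberg-adjusted P values as −log10 on the y axis) can be sketched as follows. This is not the BioJupies or VolcaNoseR code. Applying the 0.05 cutoff to adjusted P values and mirroring the 1.3-fold cutoff for down-regulated genes (fold change ≤ 1/1.3) are assumptions made here for illustration.

```python
import numpy as np


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working from the largest P value downwards.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


def volcano_coordinates(fold_change, pvalues, p_cut=0.05, fc_cut=1.3):
    """Return (log2 FC, -log10 adjusted P, up mask, down mask) for plotting."""
    fc = np.asarray(fold_change, dtype=float)
    adjusted = benjamini_hochberg(pvalues)
    log2_fc = np.log2(fc)
    neg_log10_p = -np.log10(adjusted)
    significant = adjusted < p_cut
    up = significant & (fc >= fc_cut)          # candidates for red points
    down = significant & (fc <= 1.0 / fc_cut)  # candidates for blue points
    return log2_fc, neg_log10_p, up, down
```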


Statistical analysis 


Statistical parameters including the exact value
of n, measures (means ± SEMs), and statistical
significance are reported in the figures
and figure legends. Data are judged to be 
statistically significant when P < 0.05 by two-tailed
Student’s t test. In figures, asterisks denote
statistical significance as calculated by
Student’s t test (*P < 0.05; **P < 0.01; ***P <
0.001; ****P < 0.0001). Statistical analyses 
were performed using GraphPad Prism 7.
Analysis of RNA-seq data has been described 
in the section “Bioinformatic analysis of RNA- 
seq data.” 
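
The reporting convention above (means ± SEMs, two-tailed Student's t test, asterisk thresholds) can be reproduced outside Prism roughly as follows. This is a sketch only: the function names and the "ns" label for nonsignificant comparisons are assumptions, and ttest_ind is used with its defaults, which give a two-sided test with pooled variance.

```python
import numpy as np
from scipy.stats import sem, ttest_ind


def mean_and_sem(values):
    """Return the mean and standard error of the mean of one group."""
    values = np.asarray(values, dtype=float)
    return float(values.mean()), float(sem(values))


def significance_stars(p_value):
    """Map a P value to the asterisk notation used in the figures."""
    if p_value < 0.0001:
        return "****"
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"  # label for nonsignificant results is an assumption


def compare_groups(group_a, group_b):
    """Two-tailed Student's t test between two groups, with asterisk label."""
    result = ttest_ind(group_a, group_b)
    return float(result.pvalue), significance_stars(result.pvalue)
```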


INTRODUCTION: Despite recent therapeutic advances,
new treatments are needed for melanoma
patients. Difficult-to-treat melanoma
subsets rely on oncogenic gene expression for 
growth and therapy resistance. In normal cells, 
gene expression is tightly regulated by RNA 
surveillance pathways. Newly described mech- 
anisms detect and degrade unnecessary RNA 
species in the nucleus to ensure gene expres- 
sion quality. 


RATIONALE: Transcriptional cyclin-dependent 
kinases (CDKs) are a family of kinases that 
have roles in directly controlling gene expres- 
sion. We examined the transcriptional CDK 
loci in publicly available melanoma patient 
data to investigate whether any of these ki- 
nases could be a therapeutic target for mela- 
noma. Unexpectedly, we observed an enrichment 
of CDK13 kinase domain mutations in melanoma, 
suggesting that CDK13 is a dominant-negative 
tumor suppressor. Therefore, we investigated 
the mechanism of mutant CDK13-mediated 
oncogenesis. 


[Figure panel labels: Mutated CDK13; Recurrent mutations]

RESULTS: CDK13 is mutated in 3.9% of cutaneous
melanomas, and these mutations are
selected for as measured by a computational 
tool that considers mutational load and se- 
verity. Kinase domain mutations were en- 
riched compared with mutations in the rest 
of the protein in melanoma (2.2-fold) and 
other cancers (1.8-fold), suggesting dominant- 
negative activity. The CDK13 kinase domain 
mutations largely overlap with mutations that 
cause a CDK13-related developmental disorder. 
Expression of kinase-mutated CDK13 expedited 
melanoma onset in a zebrafish melanoma 
model and caused human melanoma cells to 
be more proliferative, demonstrating CDK13’s 
dominant-negative tumor-suppressive function. 

[Figure panel labels: Prematurely terminated transcripts degraded; Prematurely terminated transcripts sufficient to drive oncogenesis]

Deficient nuclear RNA surveillance is oncogenic. CDK13 phosphorylates ZC3H14 to activate
nuclear RNA surveillance on protein-coding genes. Mutant CDK13 fails to phosphorylate ZC3H14,
and aberrant RNAs are stabilized. CDK13 has properties consistent with a tumor suppressor, as shown
from patient data and zebrafish melanoma models. Recurrent mutations in nuclear RNA surveillance
genes were identified in cancer. The expression of ptRNAs is sufficient to drive melanomagenesis.

Because CDK13 is related to known transcriptional
kinases, we tested for a global gene
expression phenotype by quantifying differential
exon usage. We found that CDK13 mutant
expression or loss of function resulted in increased
expression of first compared with last
exons, indicating the accumulation of short
RNAs. Using a specialized sequencing technique,
we defined the 3′ end of RNAs, which
showed that mutant CDK13 zebrafish melanomas
and human melanoma cells had
increased prematurely terminated RNAs 
(ptRNAs) ending in introns. We found that 
ptRNAs accumulate posttranscriptionally 
through lack of degradation using digital 
droplet PCR and nascent RNA sequencing. 
Because ptRNAs are capped, spliced, and 
polyadenylated, we performed whole-cell 
proteomics of mutant CDK13 versus control 
zebrafish melanomas and found translation 
of some ptRNAs, including their intronic re- 
gions, which could be a source of neoantigens. 

To elucidate the mechanism of mutant 
CDK13 oncogenesis, we immunoprecipitated 
CDK13 and discovered binding to proteins 
associated with the polyA tail exosome tar- 
geting (PAXT) complex, which targets ptRNAs 
for degradation in the nucleus. We immunoprecipitated
ZC3H14 from CDK13WT and
CDK13mut human melanoma cells and found
that ZC3H14 lacks phosphorylation at S475
in CDK13mut cells, suggesting that CDK13
directly phosphorylates ZC3H14 S475. We 
rescued PAXT recruitment and activity with 
expression of a ZC3H14 phosphomimetic 
mutant in CDK13mut cells. We also showed
that the expression of a nonphosphorylat- 
able ZC3H14 decreased PAXT recruitment 
and activation in CDK13WT cells. CDK13 activated
by CCNT1 was able to in vitro phosphorylate
ZC3H14WT, but not ZC3H14S475A,
showing that CDK13 can directly phosphorylate
ZC3H14 on the relevant residue.

We found that ptRNAs accumulated in other 
CDK13mut cancers. To determine whether
ptRNA expression is sufficient to expedite 
oncogenesis, we expressed two human 
ptRNAs in the zebrafish model and found 
that they both expedited melanoma onset. 
Finally, we found that additional nuclear RNA 
surveillance components are recurrently mu- 
tated in cancer. These experiments show that 
ptRNAs can promote cancer phenotypes and 
that components surveilling aberrant RNAs 
are mutated in cancer. 


CONCLUSION: Our work shows that CDK13 has 
properties consistent with a tumor suppressor, 
and that mutant CDK13 is oncogenic because 
of deficient RNA surveillance. Our finding that 
recurrent mutations occur in additional PAXT 
components suggests a broad, previously un- 
recognized tumor-suppressive role for nuclear 
RNA surveillance. 
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RNA surveillance pathways detect and degrade defective transcripts to ensure RNA fidelity. We found 
that disrupted nuclear RNA surveillance is oncogenic. Cyclin-dependent kinase 13 (CDK13) is mutated in 
melanoma, and patient-mutated CDK13 accelerates zebrafish melanoma. CDK13 mutation causes 
aberrant RNA stabilization. CDK13 is required for ZC3H14 phosphorylation, which is necessary and 
sufficient to promote nuclear RNA degradation. Mutant CDK13 fails to activate nuclear RNA surveillance, 
causing aberrant protein-coding transcripts to be stabilized and translated. Forced aberrant RNA 
expression accelerates melanoma in zebrafish. We found recurrent mutations in genes encoding nuclear 
RNA surveillance components in many malignancies, establishing nuclear RNA surveillance as a tumor- 
suppressive pathway. Activating nuclear RNA surveillance is crucial to avoid accumulation of aberrant 
RNAs and their ensuing consequences in development and disease. 


Transcriptional cyclin-dependent kinases
(CDKs) are a family of kinases activated
by a cyclin-binding partner (1, 2) that
have roles in controlling transcriptional
subprocesses including initiation, elongation,
co-transcriptional RNA processing, and
termination (3-9). Transcriptional CDKs,
including CDK13, are being investigated as
therapeutic targets for many difficult-to-treat
cancers (10-14). Mutations affecting
CDK13 cause syndromic developmental dis- 
orders that affect neural crest-derived tissues 
(congenital heart defects, dysmorphic facial 
features, and intellectual development) (15-19). 
The role of CDK13 in transcription and RNA 
processing remains poorly understood. 
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RNA polymerase II (RNAPII) is exquisitely 
regulated to ensure that transcriptional initi- 
ation, elongation, co-transcriptional process- 
ing, and termination are precisely executed 
on protein-coding genes. During elongation, 
a rapid cascade of RNA-processing steps takes 
place (20). RNA-processing errors can result in 
aberrant transcripts that must be degraded 
by RNA surveillance pathways. Messages that 
prematurely terminate are predicted to evade 
nonsense-mediated decay because of the lack 
of downstream splice events (21), and require
a complementary surveillance pathway to 
avoid accumulation and translation of path- 
ogenic transcripts. 

Aberrant or unstable nuclear RNAs are rec- 
ognized by adaptor complexes, which recruit 
the MTR4 helicase and the nuclear RNA 
exosome degradation complex to specific 
RNAs. The recently reported polyA exosome 
targeting (PAXT) complex targets prematurely 
terminated RNAs (ptRNAs) for degradation 
(22, 23). How ptRNAs (and not mRNAs) are 
specifically targeted for PAXT degradation is 
unknown. 

In this study, we found that patient-specific 
mutations in CDK13 accelerate oncogenesis in 
human cells and in a zebrafish melanoma 
model, and that CDK13 is responsible for 
activating nuclear RNA surveillance. When 
CDK13 is mutated or lost, phosphorylation 
of the PAXT component ZC3H14 is compro- 
mised, and ptRNAs fail to be targeted for 
nuclear degradation. Consequently, stabi- 
lized ptRNAs are translated into truncated 
proteins. Overexpression of ptRNAs leads 
to accelerated melanoma in zebrafish. We 
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found that PAXT subunits recruited by CDK13 
phosphorylation are mutated in 17% of mela- 
nomas, and that recurrent mutations in PAXT 
subunits ZFC3H1 and ZC3H18 occur in many 
cancer types. This work demonstrates that 
CDK13 normally activates nuclear RNA sur- 
veillance machinery to degrade oncogenic 
ptRNAs. 


Results 
CDK13 has properties consistent with a 
tumor suppressor 


We initially examined the transcriptional CDK 
loci as defined by (2) in melanoma patient 
samples from The Cancer Genome Atlas 
(TCGA) (24) and found that CDK13 was mu- 
tated in 4.6% of patients and had a cluster of 
mutations near the ATP-binding site (table S1). 
We then analyzed a larger set of publicly 
available patient data (24-33) specifically for 
CDK13 mutations and found that CDK13 is
somatically mutated in 3.9% of cutaneous
melanomas (52 of 1347 patients) (fig. S1A and
table S2). To determine whether CDK13 somatic
mutations are selected for in melanoma,
OncodriveFM (34) was used because it accounts 
for background mutational load and predicted 
mutational severity. This analysis suggested 
that CDK13 is a significant melanoma driver 
gene (q = 0.042, P = 0.0049). We observed 2.2-fold
enrichment of deleterious mutations in
the kinase domain versus the remainder of 
the protein (Fisher’s exact test, P = 0.03, 
odds ratio = 5.4) (Fig. 1A, red and orange; fig. 
S1A; table S2; and data S1). To assess CDK13
mutational clonality, we identified samples that 
were copy neutral for CDK13 and compared 
the variant allelic fraction of CDK13 mutations 
against known driver mutations from the same 
samples (24). Of six evaluable tumors, four were 
clonal (P746T, R860Q, E1248K, and P881L), one 
was subclonal (R1206M), and one was potentially
subclonal (E1448X) with a variant allelic
fraction similar to NF1 (fig. S1B and data
S1). In non-copy-neutral CDK13 mutant samples,
we observed additional samples harboring
clonal or potentially clonal CDK13 mutations
(9 of 12). Of the seven samples with clonal 
CDK13 mutations, all were heterozygous. CDK13 
is mutated in many other malignancies (fig. 
S1C), which also display an enrichment in kinase
domain mutations (1.8x, n = 13269 from
56 studies, P = 2.47 x 10°") (data S1). Loss of 
heterozygosity is infrequently observed in 
melanoma. Of the 1347 melanoma patient sam- 
ples, we assessed those with copy number and 
mutational information for loss of heterozy- 
gosity and found that 0.5% had biallelic loss 
(n = 832, one biallelic deletion, and three 
with a mutation and heterozygous deletion). 
These data show that CDK13 is a mutated 
driver gene with a modest enrichment in ki- 
nase domain mutations, suggesting that kinase 
domain mutations have additional selection 
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Fig. 1. CDK13 has properties consistent with a tumor suppressor. (A) CDK13 
melanoma mutation plot. Red asterisk indicates the phosphorylation site. 

(B) Patient survival plot with CDK13 mutation or down-regulation versus 
remaining patients. P = 0.0028, log-rank test. n = patients. (C) Patient kinase- 
domain mutations mapped on the CDK13 crystal structure. pT871 is the 
phosphorylation site. (D) In vitro kinase assay of wild-type (WT) and patient- 
mutated CDK13 using full-length CTD52 as the substrate. One-way ANOVA
with no kinase versus all conditions was used. WT CDK13, ****q = 0.0001; all
other comparisons were nonsignificant. Data are shown as means ± SD.

(E) Representative photos of mitfa:BRAF^V600E;p53^-/-;mitfa^-/- (Triples melanoma
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model zebrafish) with control guide RNA (gRNA), cdk13 gRNA, human 
CDK13^WT, or cdk13 gRNA and human CDK13^WT at 4 weeks after fertilization.
(F) Quantification of melanocytes at 3 days after fertilization for Triples zebrafish
injected with control gRNA, cdk13 gRNA, human CDK13^WT, or cdk13 gRNA and
human CDK13^WT. One-way ANOVA, multiple comparisons. Data are shown as
means ± SD. *q = 0.0186, **q = 0.0030, ****q < 0.0001. n = number of
zebrafish. (G) PH3 antibody staining per square millimeter of melanoma from
Triples zebrafish injected with cdk13 gRNA compared with control gRNA.
P = 0.014 (Mann-Whitney two-tailed test). n = number of melanomas. (H) Nine-week
photos of melanocyte-specific expression of EGFP, human CDK13^WT,
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pressure beyond loss of function, such as 
dominant-negative activity. 
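
The kinase-domain enrichment reported here rests on a Fisher's exact test of a 2 x 2 table (deleterious versus other mutations, inside versus outside the kinase domain). A minimal sketch of a one-sided version of that test follows. The function name and the counts in the usage line are hypothetical placeholders and are not the values in data S1.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns (odds_ratio, p_value), where p_value is the hypergeometric
    probability of a top-left count >= a with all margins held fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)
    upper = min(row1, col1)
    p_value = sum(
        comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1)
    ) / denom
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, p_value

# Hypothetical counts: rows = kinase domain / rest of protein,
# columns = deleterious / other mutations.
odds_ratio, p_value = fisher_exact_greater(20, 10, 12, 30)
```

An odds ratio above 1 with a small P value would indicate that deleterious mutations are over-represented in the first row of the table.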

De novo heterozygous CDK13 kinase-domain 
mutation has been reported to cause a CDK13- 
associated developmental syndrome (80%, 32 
of 40 patients) (15-19), suggesting the func- 
tional importance of these mutations. All but 
two of the CDK13-developmental kinase do- 
main variants are found as somatic mutations 
in cancer. The notable difference is the N842 
variant, the most common variant associated 
with the developmental disorder, which was 
not found as a somatic mutation in melanoma 
or nonmelanoma cancer (data S1). Although 
CDK13 was not considered a driver gene in 
prior studies because of its low mutational 
rate, these data show that CDK13 is a broadly 
used driver gene with an enrichment in del- 
eterious kinase-domain mutations, suggesting 
a function beyond haploinsufficiency. 

If mutant CDK13 functions by interfering 
with WT CDK13, in other words has dominant- 
negative activity, then mutant CDK13 would 
be expected to result in a phenotype similar 
to CDK13 down-regulation. Survival analysis 
using human melanoma patient (24) data from 
TCGA revealed that patients with CDK13 down-regulation
(z score < -1.0, fold change 0.48) or
somatic mutation (CDK13 altered) combined
had decreased overall survival (Fig. 1B and
fig. S1, D and E; log-rank P = 0.0028). CDK13
alteration remained associated with poor survival
after adjustment in a multivariate model
(hazard ratio = 1.81; 95%
confidence interval, 1.13 to 2.90; Cox P = 0.014;
table S3). Patients initially staged with 0/1/2
melanoma with CDK13 alterations exhibited
reduced overall survival compared with remaining
stage 0/1/2 melanoma patients (fig. S1F;
log-rank P = 0.0012). The percentages of CDK13-altered
(mutated or down-regulated) melanomas
with BRAF V600 mutations or NRAS Q61 mutations
were compared with the remaining cases in 
the TCGA melanoma cohort (24), and no sig- 
nificant differences were found (chi-square 
test, P = 0.287) (Table 1). These data suggest 
that CDK13 mutation or down-regulation is 
associated with poor overall survival in mela- 
noma patients. 
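
The subtype comparison can be checked directly from the counts listed in Table 1. The sketch below assumes that the chi-square test was run on the 2 x 3 table of counts (CDK13-altered versus all other cases, across the three genetic subtype columns); that layout is our reading of the table and is not stated explicitly in the text. With 2 degrees of freedom the chi-square tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
from math import exp

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

def chi2_survival_df2(x):
    """Upper-tail probability of a chi-square variable with exactly 2 degrees of freedom."""
    return exp(-x / 2)

# Counts from Table 1: mutated + down-regulated per column (5+7, 5+7, 2+7),
# followed by the "all other cases" row.
table1 = [[12, 12, 9], [129, 68, 57]]
chi2, df = chi_square_independence(table1)
p_value = chi2_survival_df2(chi2)  # about 0.287, in line with the reported P
```

Under that assumption the statistic is about 2.50 on 2 degrees of freedom, consistent with the absence of a significant difference between groups.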

CDK13 melanoma mutations that occur in 
the kinase domain were projected onto a pre- 
viously solved crystal structure (Fig. 1C) (35). 
The P869 residue is located two residues away
patient-mutant CDK13, or CDK13^K734R (catalytically dead control) in Triples
zebrafish. Arrows indicate melanomas. (I) Percentage melanoma-free survival of 
Triples zebrafish with melanocyte-specific expression of EGFP, CDK1 
patient mutation), and CDK13^K734R (catalytically dead). ****P < 0.0001 (log-rank).
n = number of zebrafish. (J) PH3 antibody staining/square millimeter of
melanoma from Triples zebrafish expressing EGFP versus CDK13 mutant (W878L 
or P893L). P = 0.0125 (Mann-Whitney test, two-tailed). n = number of 
melanomas. (K) Melanoma-free survival of zebrafish with melanocyte-specific 
expression of CDK13^W878L with melanocyte-specific CRISPR of either a control
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gene or ccnT1. P = 0.0098 (log-rank). n = number of zebrafish. (L) Melanoma- 
free survival of zebrafish with melanocyte-specific CRISPR of a control gene or 
ccnT1 alone. ns, nonsignificant (log-rank). n = number of zebrafish. (M) Doubling
time for human melanoma cells expressing CLOVER, CDK13^WT, CDK13^W878L,
or CDK13^R860Q. One-way ANOVA with CLOVER versus all conditions. CDK13^W878L,
**q = 0.0044; CDK13^R860Q, ***q = 0.0009. ns, nonsignificant. Data are shown
as means ± SD. n = 3 biologic replicates. For all box plots, the box is the 25th
to 75th percentile range, the solid line represents the median, and whiskers
show maximum to minimum.


Table 1. CDK13 alterations occur in all genetic subtypes in melanoma TCGA patients.

                                   BRAF V600 mutations                           NRAS Q61 mutations                            
CDK13 mutated and down-regulated*  36.4% (n = 5 mutated, n = 7 down-regulated)   36.4% (n = 5 mutated, n = 7 down-regulated)   27.3% (n = 2 mutated, n = 7 down-regulated)
All other cases†                   50.8% (n = 129)                               26.8% (n = 68)                                22.4% (n = 57)

*Percentage of cases in N = 33 of 287. †Percentage of cases in N = 254 of 287 patients with DNA mutational and RNA
expression data available.


from the T871 phosphorylation site, and the
R860 residue normally coordinates with
phospho-T871, so these mutations could disrupt
kinase activation. The W878 mutation
may change the substrate-binding pocket, and
mutants P881L, P893L, and I843N might disrupt
the integrity of the kinase domain structure.
In vitro, CDK13^WT activated by its canonical
cyclin partner, cyclin K (CCNK), had robust
kinase activity, whereas the R860Q, W878L,
and K734R mutants failed to phosphorylate
full-length RNAPII C-terminal domain (CTD52)
and a second substrate (Fig. 1D and fig. S1G),
but maintained cyclin binding (fig. S1G, right).
The CDK13^K734R mutation replaces the lysine
required for catalysis (36) and occurs in the
CDK13-associated developmental syndrome (15).
These data show that CDK13 mutations observed
in melanoma abrogate kinase activity.

To examine the role of CDK13 in melanoma, 
the MAZERATI rapid modeling system was 
used with mitfa:BRAF^V600E;p53^-/-;mitfa^-/-
zebrafish, hereafter referred to as the “Triples” 
zebrafish melanoma model. These zebrafish 
lack Mitfa, the master regulator of melanocyte 
development, and also lack melanocytes (37-39). 
Melanocytes were rescued with a vector that 
expresses Mitfa, allowing cell-autonomous 
melanocyte-specific gene expression or CRISPR/ 
Cas9-mediated inactivation using the mitfa 
promoter. Zebrafish with melanocyte-specific 
CRISPR-deletion of zebrafish cdk13 compared 
with that of a control gene showed significantly
decreased melanocyte numbers (one-way
ANOVA q = 0.0030), indicating that cdk13
is required for melanocyte development (Fig. 1,
E and F, and fig. S1, H and I). Overexpression
of human wild-type CDK13 (CDK13^WT) also
resulted in fewer melanocytes during development
(Fig. 1, E and F, and fig. S1J) but was
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able to rescue melanocyte number loss caused 
by cdk13 deletion (Fig. 1, E and F, and fig. S1K). 
Rarely, melanomas arose in zebrafish with 
cdk13 melanocyte-specific CRISPR-deletion, 
and phospho-histone 3 (PH3) immunohisto- 
chemical staining revealed that cdk13 CRISPR- 
deleted versus control melanomas were 
significantly more proliferative (Mann-Whitney
two-tailed test, P = 0.014) (Fig. 1G and fig. S1,
L and M). Proper Cdk13 levels are required for
melanocyte development and human CDK13 
complements the loss of zebrafish cdk13. 

To test whether CDK13 melanoma patient 
mutations cause more aggressive melanoma 
in vivo, mutant CDK13 was expressed in 
melanocytes in the Triples zebrafish melanoma 
model. Control enhanced green fluorescent 
protein (EGFP) expression resulted in expected 
rescue of mosaic stripes, whereas human
CDK13^WT expression resulted in few melanocytes
at 9 weeks after fertilization, so CDK13^WT-expressing
animals were not followed for
tumor onset. By contrast, overexpression
of CDK13^R860Q, CDK13^P869S, CDK13^W878L,
CDK13^K734R, or CDK13^P893L caused the appearance
of black patches at 9 weeks after
fertilization (Fig. 1H and fig. S1N) and expedited
tumor onset (Fig. 1I and fig. S1, O and
P). Because all tested CDK13 mutations promoted
melanoma to a similar degree, we have
used them interchangeably in most assays
(hereafter called CDK13^mut). CDK13^mut-expressing
melanomas had more PH3-positive cells, as
shown by immunohistochemistry, than controls
(Fig. 1J and fig. S1Q). Because both melanocyte-specific
cdk13 CRISPR deletion and forced
human CDK13^mut expression caused more
proliferative melanomas, our data support that
CDK13^mut acts through a dominant-negative
or antimorphic mechanism in which human
CDK13^mut adversely affects intact zebrafish
cdk13^WT activity. Our data are consistent with
existing human developmental disorder genetics 
showing that CDK13 mutations work through 
a dominant-negative mechanism (15-19). 

We hypothesized that cyclin binding is required
for CDK13^mut dominant-negative activity.
To test whether CCNK (35, 40-42) or cyclin T1
(CCNT1) is required for CDK13^mut dominant-negative
melanomagenesis in vivo, we coinjected
vectors that express melanocyte-specific
CDK13^mut and that delete ccnK or ccnT1 in
the Triples zebrafish melanoma model. ccnK
melanocyte-specific inactivation expedited melanoma
in the presence of CDK13^W878L, but not
alone (fig. S1, R and S). ccnT1 melanocyte-specific
inactivation suppressed CDK13^W878L and
CDK13^R860Q oncogenesis but had no effect
alone (Fig. 1, K and L, and fig. S1, T and U).
These data indicate that the oncogenic mechanism
of CDK13^mut requires CCNT1.

To determine the effect of mutant CDK13 
expression in human melanoma cells, CLOVER 
fluorescent protein, CDK13^WT, CDK13^W878L, or
CDK13^R860Q was expressed in human A375
melanoma cells. Cells expressing either CDK13
mutant had a decreased doubling time (increased
growth rate) compared with CDK13^WT
or CLOVER (Fig. 1M and fig. S1V). These data
show that mutant CDK13 expression causes 
human melanoma cells to proliferate more 
quickly, indicating that mutant CDK13’s pro- 
proliferative function is conserved. 
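
Doubling time and growth rate are inversely related under exponential growth, which is why a shorter doubling time is read as faster proliferation. A minimal sketch of that calculation follows, assuming two cell counts taken a known number of hours apart. The function name and the counts are illustrative and are not taken from the growth assays in Fig. 1M.

```python
from math import log

def doubling_time(n_start, n_end, elapsed_hours):
    """Doubling time in hours, assuming exponential growth between two counts."""
    if n_start <= 0 or n_end <= n_start:
        raise ValueError("counts must be positive and increasing")
    growth_rate = log(n_end / n_start) / elapsed_hours  # per hour
    return log(2) / growth_rate

# Hypothetical counts: a culture that quadruples in 48 hours doubles every 24 hours.
hours = doubling_time(1e5, 4e5, 48)
```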


CDK13 mutation results in accumulation of 
RNAs that prematurely terminate in introns 


Because CDK13 is phylogenetically related to 
known transcriptional kinases, we tested for 
a global RNA phenotype by quantifying dif- 
ferential exon usage. Differential expression 
analysis of first (F), alternative first (AF), in- 
ternal (I), last (L), or alternative last (AL) 
exons in each gene was performed in zebra- 
fish melanomas. We observed significantly 
increased read coverage in the first exon 
compared with the last exon in CDK13^mut-expressing
versus control melanomas (two-sided
Wilcoxon rank-sum test P < 2.2 x 10^-16)
(Fig. 2A and fig. S2A). These data suggest that
ptRNAs are present in CDK13^mut zebrafish
melanomas. 
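
The first-versus-last exon comparison can be illustrated with a small rank-sum calculation. The sketch below is a plain-Python version of the two-sided Wilcoxon rank-sum (Mann-Whitney U) test using the normal approximation without a tie correction to the variance. It is not the authors' pipeline, and the log2 fold-change values in the usage lines are invented for illustration.

```python
from math import erfc, sqrt

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (u_statistic_for_x, p_value). Tied values receive average ranks.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        rank_sum_x += average_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return u, erfc(abs(z) / sqrt(2))

# Hypothetical per-gene log2 fold changes (mutant versus control).
first_exon_lfc = [0.9, 1.1, 0.7, 1.4, 0.8, 1.0]
last_exon_lfc = [0.1, -0.2, 0.0, 0.2, -0.1, 0.3]
u_stat, p_value = rank_sum_test(first_exon_lfc, last_exon_lfc)
```

A first-exon distribution shifted above the last-exon distribution is the signature expected when transcripts terminate before reaching the 3’ end of the gene.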

Analysis of poly-A-selected RNA-sequencing 
(RNA-seq) from patient melanomas (patient 
characteristics in table S4) showed that 
patients with CDK13^mut expression versus
CDK13^WT expression had increased read coverage
in the first exon versus the last exon
(Fig. 2B), indicating the presence of ptRNAs. 
ptRNA evidence was of greater magnitude in 
the patient RNA-seq data that were poly-A 
selected compared with non-polyA selected, 
indicating that ptRNAs are polyadenylated 
(compare Fig. 2, B versus A). We generated 
mouse embryonic stem cells (mESCs) that
were Cdk13^-/- and carried a doxycycline-inducible
mouse Cdk13 rescue transgene.
PolyA-selected RNA-seq was done at time 
zero (CDK13 expressed) and at 48 and 72 hours 
after doxycycline removal (CDK13 depleted) 
(fig. S2, B to D). mESCs that were CDK13 de- 
pleted for 48 hours showed evidence of 
accumulated ptRNAs, with this effect inten- 
sifying at 72 hours after depletion (Fig. 2C). 
Accumulated ptRNAs in CDK13-depleted mESCs 
were largely nonoverlapping from CDK12- 
depleted mESCs generated analogously (7) 
(fig. S2E and data S1). These data show that 
CDK13’s role in preventing the accumulation 
of polyadenylated ptRNAs is conserved. 

3' RNA-seq was used to determine where 
ptRNAs terminate in CDK13^mut melanomas.
3’ sequencing was executed on Triples zebrafish
melanomas with melanocyte-specific expression
of EGFP (n = 3) or CDK13^mut
(n = 3) and human melanoma A375 cells
expressing CLOVER (n = 2) or CDK13^mut (n = 2).
A total of 83,660 overall recurrent termination
events were identified in zebrafish melanomas
and 47,787 overall termination events were
identified in human melanoma cells. We used
DEXSeq (43) to quantify differential usage of
polyA termination sites between the CDK13^mut
and EGFP conditions in zebrafish melanomas,
which identified 802 significantly different
termination events (q < 0.05). Parallel analysis
from human melanoma cells revealed 1678
significantly different termination events (q <
0.05) in CDK13^mut compared with control
CLOVER cells. To determine whether conserved
genes or pathways were affected in
zebrafish and human melanomas, the affected 
zebrafish genes were transferred to the nearest 
ortholog and the gene lists were overlapped. 
Only 3.4% of genes were found in both lists, 
and gene ontology analysis showed no en- 
riched pathways using the “biologic process 
complete” annotation set. Given the lack of 
shared functional pathways, these data sug- 
gest that the proliferative phenotype observed 
in CDK13 mutant patient melanomas, zebra- 
fish melanomas, and human melanoma cells 
may be promoted by cellular stress from ac- 
cumulated truncated RNAs, as opposed to trun- 
cation of a specific conserved target gene(s). 

A significant increase in intronic cleavage 
sites was observed in CDK13^mut zebrafish melanomas
(Fig. 2D) and human melanoma cells
(Fig. 2E). In zebrafish melanomas, the significantly
changed sites were enriched in introns
(median fold change 2.54) and depleted from
untranslated regions (UTRs) (median fold
change -1.51) (P < 2.2 x 10^-16, two-sided
Kolmogorov-Smirnov test); and in human
melanoma cells, the significantly changed sites
were also enriched in introns (median fold
change 0.38) and depleted from UTRs (median
fold change -1.67) (P = 9.99 x 10^-16, two-sided
Kolmogorov-Smirnov test). In CDK13^mut human
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melanoma cells, SUV39H1 and CBFB each 
showed an up-regulated ptRNA that termi- 
nated in the third intron (Fig. 2F and fig. S2F), 
whereas TP53 had an up-regulated ptRNA
that terminated in the first intron (fig. S2G).
For many up-regulated intronic polyadenylation
sites, a canonical RNA polyadenylation
site motif was observed (Fig. 2F and fig. S2, F
and G). These data show that CDK13^mut causes
accumulation of ptRNAs generated from exist- 
ing cleavage and polyadenylation sites found in 
intronic DNA in zebrafish and humans. 

ptRNAs could accumulate from increased 
production or decreased clearance in CDK13 
mutant cells. If transcription elongation is 
impaired, then ptRNAs should be produced 
at a higher rate at the expense of full-length 
mRNA synthesis. In this case, the absolute 
amount of first exon expression in CDK13 
mutant cells should be unchanged compared 
with control, whereas the last exon expression 
should be decreased (Fig. 2G, model 1). Al- 
ternatively, if ptRNA clearance is decreased 
in CDK13^mut cells, then ptRNA concentration
should rise without affecting full-length mRNA 
expression. The first exon expression should 
increase, whereas the last exon expression 
should remain stable (Fig. 2G, model 2). We 
hypothesized that absolute quantification of 
RNA species could help to distinguish be- 
tween these two models. 

To measure concentration of ptRNAs and 
full-length mRNAs, digital droplet polymerase 
chain reaction (ddPCR) was performed for 
four genes with up-regulated ptRNAs and two 
control genes, which were identified from the 
3' sequencing described above. The signal from 
first exon probes represented both ptRNAs and 
full-length mRNAs, whereas the signal from 
the last exon probes represented full-length 
mRNAs alone (Fig. 2H and fig. S2H). The fold 
change between CDK13^mut and control cells
showed that all genes with predicted ptRNAs
had an increase in first exon expression, consistent
with ptRNA accumulation in CDK13^mut
cells (CBFB, LATS2, SUV39H1, and TP53).
Neither control gene showed evidence of
accumulated ptRNAs (C1orf35 and AGPAT1).
All six genes showed unchanged last exon expression
in CDK13^mut cells compared with
control cells (Fig. 2H). These data verify increased
ptRNA levels in CDK13^mut human
melanoma cells. Last exon expression
was intact for all genes tested, showing that
transcription is intact and suggesting that
ptRNAs accumulate in CDK13^mut cells through
lack of clearance (Fig. 2G, model 2). 
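
The logic used to separate the two models from absolute first- and last-exon measurements can be written as a simple decision rule. The sketch below is only an illustration of that reasoning: the function name, the 20% tolerance for calling a fold change "unchanged", and the example values are our assumptions and are not parameters from the ddPCR analysis.

```python
def classify_ptrna_model(first_exon_fc, last_exon_fc, tolerance=0.2):
    """Classify a gene from mutant/control fold changes in absolute copy number.

    first_exon_fc: fold change of the first-exon probe (ptRNA + full-length mRNA).
    last_exon_fc:  fold change of the last-exon probe (full-length mRNA only).
    tolerance:     hypothetical band around 1.0 treated as unchanged.
    """
    first_up = first_exon_fc > 1 + tolerance
    first_flat = abs(first_exon_fc - 1) <= tolerance
    last_down = last_exon_fc < 1 - tolerance
    last_flat = abs(last_exon_fc - 1) <= tolerance

    if first_flat and last_down:
        return "model 1: impaired elongation"
    if first_up and last_flat:
        return "model 2: decreased clearance"
    if first_flat and last_flat:
        return "no ptRNA accumulation"
    return "unclassified"

# Hypothetical fold changes for a gene with an accumulated ptRNA.
call = classify_ptrna_model(first_exon_fc=2.1, last_exon_fc=1.05)
```

In this scheme the four ptRNA genes described above would fall under model 2 and the two control genes under "no ptRNA accumulation", provided their measured fold changes sit inside the chosen tolerance.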


ptRNAs accumulate posttranscriptionally and 
are translated, including intronic sequences 


To directly examine transcription in CDK13^mut-expressing
human melanoma cells, genome-wide
nascent RNA production was measured
with transient transcriptome sequencing (TT-seq)


[Fig. 2 panels. (A) Zebrafish melanomas, (B) Patient melanomas, (C) Mouse ES cells at 48 and 72 hrs: exon classes First (F), AltFirst (AF), Internal, AltLast (AL), and Last (L). (D) Zebrafish melanoma 3’ RNA sequencing: log2 CDK13^mut/EGFP by 3’ cleavage location. (E) Human melanoma cells 3’ RNA seq: log2 CDK13^mut/CLOVER by 3’ cleavage location. (F) Human melanoma cells 3’ RNA seq: CLOVER and CDK13^mut tracks, 2-kb scale. (G) Model 1, impaired transcriptional elongation: RNAPII, template DNA, full-length mRNAs, ptRNAs. (H) ddPCR: first versus last exon.]


Fig. 2. CDK13 mutation results in accumulation of RNAs that prematurely 
terminate in introns. (A) Log2-fold normalized exon expression in CDK13°°°°- 
expressing (n = 5) versus EGFP-expressing (n = 4) zebrafish melanomas. 
****P < 2.2 x 10^-16, first versus last exon. (B) Log2-fold normalized exon expression
for CDK13^mut (n = 3) versus CDK13^WT (n = 5) matched control patient
melanomas. ****P < 2.2 x 10^-16, F/AF versus L/AL exon. (C) Log2-fold
normalized exon expression in Cdk13^-/- versus Cdk13^-/-;Cdk13+ (control)
mESCs. 48 and 72 hours, hours of Cdk13 depletion (n = 4 for each).
***P < 2.2 x 10^-16, F/AF versus L/AL exon. (D and E) Log2-fold CDK13^mut/
control cleavage site utilization (3' seq) in UTRs, introns, and exons in
zebrafish melanomas (D) and human melanoma cells (E). (F) Integrative 
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1.5x the interquartile range (IQR). P values in (A) to (C) are from two-sided 
Wilcoxon rank-sum tests. 
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alongside paired standard RNA-seq (Fig. 3A, 
TT-seq schematic) (44). If ptRNAs are gener- 
ated through impaired transcriptional elonga- 
tion, then there should be less 4-thiouridine 
(4sU) incorporation at the 3’ ends of genes 
(Fig. 3A, model 1). If ptRNAs accumulate 
through a posttranscriptional process such as 
impaired degradation, then 4sU incorporation 
should be intact at the 3’ ends of genes (Fig. 
3A, model 2). RNA-seq differential exon analysis
confirmed increased 5’ coverage in CDK13^mut
cells consistent with accumulated ptRNAs (fig.
S3A). TT-seq metagenes showed no evidence
of an increase in premature termination by
RNAPII; instead, we observed a slight increase
in RNA synthesis across the gene body in
CDK13^mut cells (Fig. 3B). In both TT-seq and
RNA-seq, the metagene profiles indicate intact
cleavage and polyadenylation in CDK13^mut
and CLOVER control cells (fig. S3, B and C).
Nascent RNA production upstream of the
cleavage and polyadenylation site is intact in
CDK13^mut-expressing human melanoma cells,
further demonstrating that elongation is functional
(fig. S3B). These data show that CDK13^mut
cells maintain productive RNAPII elongation 
and that ptRNAs accumulate through a post- 
transcriptional mechanism such as loss of 
ptRNA degradation. 

To measure production and steady-state 
levels of ptRNAs at intronic polyadenylation 
(IPA) sites in CDK13^mut cells, TT-seq and RNA-seq
read coverage were quantified across IPA
sites regulated by CDK13 as measured from
3’ seq from Fig. 2E (q < 0.1). To allow pure
intronic read quantification, IPA sites were
included if they were >400 base pairs from
the nearest exon (n = 134). The upstream/downstream
coverage ratio for IPA sites from
CDK13^mut versus control human melanoma
cells was compared with the median ratio in the
RNA-seq and the TT-seq (Fig. 3C). The upstream/downstream
coverage ratio in the TT-seq
(nascent RNA) was unaffected in CDK13^mut
cells (Fig. 3D, TT-seq, left), showing that there
is similar low-level intronic transcriptional termination.
By contrast, the RNA-seq upstream/downstream
coverage ratio was significantly
increased in CDK13^mut cells (P = 0.01) (Fig. 3D,
RNA-seq, right; and fig. S3, D and E), reflecting
higher steady-state RNA levels upstream
of IPA sites. These data show that CDK13^mut
cells stabilize ptRNAs posttranscriptionally. 
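
The upstream/downstream coverage ratio at an IPA site can be sketched as follows. The >400 base pair exon-distance filter comes from the text; the window size, the pseudocount, the function names, and the toy coverage vector are assumptions made for illustration and do not reproduce the actual quantification.

```python
def far_from_exons(site, exon_boundaries, min_distance=400):
    """True if the IPA site lies more than min_distance bases from every exon boundary."""
    return all(abs(site - boundary) > min_distance for boundary in exon_boundaries)

def ipa_coverage_ratio(coverage, site, window=200, pseudocount=1.0):
    """Mean read depth upstream of an IPA site divided by mean depth downstream.

    coverage: per-base read depth along the intron, in the direction of transcription.
    site:     index of the cleavage and polyadenylation site within coverage.
    window and pseudocount are hypothetical choices, not values from the study.
    """
    upstream = coverage[max(0, site - window):site]
    downstream = coverage[site:site + window]
    if not upstream or not downstream:
        raise ValueError("site must have coverage on both sides")
    mean_up = sum(upstream) / len(upstream)
    mean_down = sum(downstream) / len(downstream)
    return (mean_up + pseudocount) / (mean_down + pseudocount)

# Toy intron: depth drops from 10 to 2 at the IPA site (steady-state RNA-seq pattern).
toy_coverage = [10] * 200 + [2] * 200
ratio = ipa_coverage_ratio(toy_coverage, site=200, pseudocount=0.0)
```

A ratio that rises in steady-state RNA-seq but not in nascent TT-seq is the pattern interpreted above as posttranscriptional stabilization rather than increased premature termination.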

To further investigate the transcriptional 
effects of chronic CDK13^mut expression, we
performed chromatin immunoprecipitation
(IP) sequencing with antibodies to RNAPII
(8WG hypophosphorylated and Ser2 CTD) in
CDK13^mut and control human melanoma cells
and zebrafish melanomas. In both systems,
increased RNAPII occupancy was observed
in gene bodies in the CDK13^mut condition (fig.
S3, F to J). These data support the ddPCR and
TT-seq results showing that transcriptional
elongation is intact and that ptRNA accumulation
occurs in CDK13^mut through a highly
conserved posttranscriptional mechanism. 
ptRNAs have been reported to be exported 
(23, 45) and are predicted to be translated (27). 
We used tandem mass spectrometry (MS) to 
analyze global protein expression in mela- 
nomas isolated from age-matched Triples 
zebrafish with melanocyte-specific expression
of CDK13^mut (n = 3) or EGFP (n = 3). Differential
protein expression analysis revealed
that 174 proteins were up-regulated, including
32 proteins related to vesicle-mediated transport
with many implicated in lysosomes/autophagy
(biologic complete, q = 2.3 x 107)
(fig. S3K). CDK13^mut melanomas have increased
lysosomal and autophagic protein expression,
suggesting that CDK13^mut cells have proteomic
stress that could be caused by translation
of truncated RNAs into truncated proteins. 

If ptRNAs are translated, then there should 
be increased peptide levels at the beginning of 
proteins and fewer at the end of proteins (Fig. 
3E). To test whether ptRNAs are translated, 
individual peptide measurements filtered for 
changes between CDK13^mut and control conditions
were analyzed (P < 0.1). Log2-fold
change CDK13^mut versus EGFP control peptide
measurements were binned and plotted
as a percentage of canonical protein sequence 
length. This analysis revealed that early (N- 
terminal) protein peptide measurements were 
increased in CDK13^mut melanomas (Fig. 3F).
The negative slope is consistent with an increase
in short proteins arising from the translation
of prematurely terminated transcripts,
analogous to increased 5’ coverage in RNA-seq
seen in CDK13^mut melanomas. This analysis
identified 263 proteins with evidence of trun- 
cation. Any protein with more than three pep- 
tide measurements was tested for evidence of 
truncation, which identified 103 truncated pro- 
teins, including 56 newly identified proteins 
(fig. S3L). Several identified truncated proteins 
are predicted to lose carboxyterminal enzy- 
matic domains, including Ik, Crk, Ikbkb. The 
melanoma tumor suppressor Idh2 was also 
affected (Fig. 3G and fig. S3M) (46-48). These 
data show that, rather than being degraded by 
the nuclear exosome, ptRNAs are being ex- 
ported and translated into truncated proteins 
in CDK13™-expressing cells. 
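
The positional analysis described above can be illustrated with a minimal sketch. It assumes that each peptide is summarized as a (start residue, protein length, log2-fold change) tuple; the function names, the default of 10 bins, and the least-squares slope are illustrative assumptions, not the procedure used in the study.

```python
from statistics import mean

def bin_by_position(peptides, n_bins=10):
    """Average log2-fold change per bin of relative protein position.

    peptides: iterable of (start_residue, protein_length, log2_fold_change),
    with start_residue counted from 1. Empty bins are returned as None.
    """
    bins = [[] for _ in range(n_bins)]
    for start, length, log2fc in peptides:
        frac = (start - 1) / length              # 0.0 at the N terminus
        idx = min(int(frac * n_bins), n_bins - 1)
        bins[idx].append(log2fc)
    return [mean(b) if b else None for b in bins]

def slope(bin_means):
    """Least-squares slope of bin mean versus bin index (needs >= 2 filled bins)."""
    pts = [(i, v) for i, v in enumerate(bin_means) if v is not None]
    x_bar = mean(x for x, _ in pts)
    y_bar = mean(y for _, y in pts)
    num = sum((x - x_bar) * (y - y_bar) for x, y in pts)
    den = sum((x - x_bar) ** 2 for x, _ in pts)
    return num / den

# Hypothetical peptides: enriched near the N terminus, depleted near the C terminus.
example = [(1, 100, 1.0), (51, 100, 0.0), (91, 100, -1.0)]
profile = bin_by_position(example)
trend = slope(profile)
```

Under these assumptions, a negative slope corresponds to the pattern expected when truncated proteins contribute N-terminal but not C-terminal peptides.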

We wondered whether intronic sequences 
at the end of ptRNAs are translated (schema- 
tized in Fig. 3E). ptRNA translation products 
were predicted by assuming in-frame transla- 
tion from the upstream exon until the first 
stop codon before ptRNA termination as de- 
termined from 3’ seq. ptRNA translation prod- 
ucts were added to the canonical proteome for 
tandem MS data search. As predicted, intronic
peptides were enriched specifically in the
CDK13mut zebrafish melanomas (Fig. 3H). The
abundance of some intronic peptides approached
that of medium- to lowly expressed
full-length proteins (fig. S3, N to P). To verify
truncated protein expression, we selected two
predicted truncated proteins with available
N-terminal antibodies for immunoblotting.
We identified truncated protein products from
CRBN and CDK13 itself in CDK13 mutant-expressing
versus control human melanoma
cells (fig. S3, Q and R). These data indicate that
some ptRNAs from the CDK13 mutant melanomas
are translated, including the intronic sequence
preceding IPA sites, which could result
in the production of tumor-specific neoantigens.
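
The prediction of ptRNA translation products (in-frame translation from the upstream exon until the first stop codon) can be sketched as follows. This is a minimal illustration using the standard genetic code; the function name, the input representation (the coding sequence up to the exon-intron boundary plus the retained intronic sequence up to the cleavage site), and the returned flag are assumptions made for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def predict_intronic_peptide(exon_cds, intron_seq):
    """Translate into the intron in the reading frame set by the upstream exons.

    exon_cds: coding sequence from the start codon to the exon-intron boundary.
    intron_seq: intronic sequence retained in the ptRNA, up to the cleavage site.
    Returns (peptide, found_stop); found_stop is False if no in-frame stop
    codon occurs before the end of the supplied sequence.
    """
    phase = len(exon_cds) % 3
    # Bases of an incomplete codon at the exon end carry over into the intron.
    carry = exon_cds[len(exon_cds) - phase:] if phase else ""
    seq = (carry + intron_seq).upper()
    peptide = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3])
        if aa == "*":
            return "".join(peptide), True
        if aa is None:                  # ambiguous base such as N: stop translating
            break
        peptide.append(aa)
    return "".join(peptide), False

# Hypothetical sequences: the exon ends one base into a codon.
with_stop = predict_intronic_peptide("ATGGCTG", "TACGTTGAACC")
without_stop = predict_intronic_peptide("ATGGCT", "GGGAAA")
```

Predicted peptides of this kind would then be appended to the canonical proteome before the database search, as described above.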


Mutant CDK13 disrupts the polyA RNA exosome 


To elucidate how CDK13mut expression results
in ptRNA accumulation through a posttranscriptional
mechanism, CDK13WT was
tagged, transiently expressed, and immunoprecipitated
(IPed) from human melanoma
cell nuclear extracts, and co-IPed proteins were
identified by MS (Fig. 4A). CDK13WT IP-MS
identified 37 proteins enriched in the CDK13WT
IP versus control (Fig. 4B). The most enriched
ontology among these 37 proteins was “mRNA
3’ end processing” (reactome, q = 7.15 x 10°")
(49). CDK13 bound CCNT1 (Fig. 4, B and C),
again implicating CCNT1 as an important
binding partner for CDK13 in melanoma, as
our functional data suggested (Fig. 1, K and L,
and fig. S1U). The canonical cyclin-binding
partner of CDK13, CCNK, was detected at levels
above background but below the threshold. No
other cyclins were detected above background.
Native CDK13 IP identified CCNT1 and CCNK
by immunoblot (fig. S4A). This unbiased approach
suggests that CDK13 binds CCNT1 in
addition to CCNK in melanoma.

Of the enriched proteins in the CDK13WT
IP (Fig. 4C), both PABPN1 and ZC3H14 interact
with the polyA tail eXosome Targeting
(PAXT) complex (22), which is responsible
for targeting ptRNAs for degradation (23).
CDK13 binding to ZC3H14 was verified by
immunoblot (fig. S4B). Because ptRNAs accumulate
in CDK13mut melanoma through a
posttranscriptional mechanism and are degraded
by the PAXT complex (50), we hypothesized
that CDK13 normally works to activate
PAXT and that loss of CDK13
kinase activity would fail to activate PAXT,
leading to accumulated ptRNAs in melanoma.

Because PABPN1 and ZC3H14 are genetically
antagonistic (51, 52) and because we observed
more ZC3H14 in our CDK13WT IP, we
chose to characterize ZC3H14 phosphorylation
and binding partners in the presence and absence
of CDK13 kinase activity. The nuclear
isoform of ZC3H14 and a control protein were
tagged, transiently expressed, and IPed from
CDK13WT and CDK13mut human melanoma cells
(fig. S4C, white, black, and blue). IPed ZC3H14
from CDK13WT cells had four phosphorylation
sites, whereas in CDK13mut cells, ZC3H14 lost
phosphorylation only at S475 (Fig. 4D and fig. S4,
D and E). Thus, CDK13 kinase activity is required
for ZC3H14 S475 phosphorylation. In
vitro kinase reactions [radioactive kinase assay
in Fig. 4E and labeled ATP immunoblot
(53) in fig. S4F] demonstrated that CDK13/CCNT1
was able to phosphorylate full-length
WT ZC3H14 but not ZC3H14 S475A.

Fig. 3. ptRNAs accumulate posttranscriptionally and are translated,
including intronic sequences. (A) Models for ptRNA accumulation as measured by
TT-seq. TSS, transcriptional start site; TES, transcriptional end site. (B) Metagene
plots are shown for TT-seq coverage over exons within nonoverlapping transcripts
from expressed genes (N = 7452) from CDK13mut versus CLOVER-expressing human
melanoma cells. TSS to TES regions are shown, flanked by ±1 kb of genomic sequence.
(C) Schematic for analysis of TT-seq and RNA-seq coverage around intronic
polyadenylation sites. (D) Box plots of the ratio of upstream [-300 to -1 nucleotides
(nt)] to downstream (+1 to +300 nt) read coverage at the 3’ cleavage locations
as schematized in (C). P values are from Wilcoxon signed-rank test compared
with the median ratio from all samples. (E) Schema of tandem mass spectrometry

Fig. 4. Mutant CDK13 disrupts the polyA RNA exosome. (A) CDK13 IP-MS schema.
(B) Heatmap of average total peptides from anti-V5 IP-MS of CDK13 WT-V5
(n = 3) or CLOVER-V5 (n = 2). (C) Log2-fold change CDK13WT/control
total peptides by -log P value (two-tailed t test). Upper right quadrant
indicates proteins with a -log P value < 0.05 and a log-fold change enrichment
over control >3.3. (D) Graphic depicting ZC3H14 phosphorylation
sites identified from ZC3H14 IPed from either CDK13WT-expressing or
CDK13mut-expressing human melanoma cells.

We looked at ZC3H14’s binding partners in
the presence and absence of CDK13 kinase activity.
When ZC3H14 was IPed from CDK13mut
cells, fewer binding partners were identified
(ZC3H14 peptides were not statistically different)
(figs. S4G, rows 2 and 3, and S4H, left
columns). Total IPed peptides were normalized
to total bait (ZC3H14) peptides. Differential
binding was calculated between CDK13WT
and CDK13mut conditions (t test P < 0.05),
which identified 18 proteins that required intact
CDK13 kinase activity to promote binding
to ZC3H14. The three most abundant proteins
that required CDK13 kinase activity to promote
binding to ZC3H14 were THOC2, ZFC3H1, and
MTR4 (Fig. 4F, black versus blue bars) (THOC2
q < 0.0001, ZFC3H1 q < 0.0001, and MTR4 q =
0.018). THOC2 functions in RNA export and
binds ZC3H14 (54), ZFC3H1 is a linker between
the PAXT and the nuclear RNA degradation
machinery, and MTR4 is a helicase required
for nuclear RNA degradation (22). We also observed
that ZC3H14 bound PAXT proteins
PABPN1 and ZC3H18; however, binding of
these proteins to ZC3H14 was independent of
CDK13 kinase activity (Fig. 4F). These data
show that ZC3H14 binds to multiple PAXT
members, and that CDK13 kinase activity promotes
ZC3H14 binding to THOC2 and two key
PAXT components, ZFC3H1 and MTR4.
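
The bait normalization described above can be sketched in a few lines. The replicate counts below are hypothetical, the function names are illustrative, and the significance test itself (a t test in the text) is left to a statistics package; only the normalization and the resulting fold change are shown.

```python
import math
from statistics import mean

def bait_normalized(prey_counts, bait_counts):
    """Divide each replicate's prey peptide count by the bait (ZC3H14)
    peptide count recovered in the same IP."""
    return [prey / bait for prey, bait in zip(prey_counts, bait_counts)]

def log2_fold_change(prey_a, bait_a, prey_b, bait_b):
    """log2 of mean bait-normalized binding in condition B relative to A.
    Assumes the prey was detected in both conditions (nonzero means)."""
    norm_a = mean(bait_normalized(prey_a, bait_a))
    norm_b = mean(bait_normalized(prey_b, bait_b))
    return math.log2(norm_b / norm_a)

# Hypothetical peptide counts for one prey protein, two replicates per condition.
wt_prey, wt_bait = [20, 30], [100, 100]      # cells with intact CDK13 kinase activity
mut_prey, mut_bait = [5, 5], [100, 50]       # cells lacking CDK13 kinase activity
change = log2_fold_change(wt_prey, wt_bait, mut_prey, mut_bait)
```

Normalizing to the bait in this way keeps a difference in the amount of ZC3H14 recovered between conditions from being mistaken for a difference in partner binding.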

To test whether ZC3H14 S475 phosphorylation
was sufficient to recruit PAXT binding,
phosphomimetic ZC3H14 S475D was tagged,
transiently expressed, and IPed from cells lacking
CDK13 kinase activity. ZC3H14 S475D was
sufficient to rescue binding of PAXT components
to ZC3H14 (Fig. 4F, red bars, and fig. S4G,
fourth row, and I and J), even in cells lacking
CDK13 kinase activity. To test whether
ZC3H14 S475 was necessary for PAXT recruitment,
nonphosphorylatable ZC3H14 S475A was
IPed from CDK13WT cells. ZC3H14 S475A failed
to recruit PAXT components (Fig. 4F, yellow
bars, and fig. S4G, last row, and K and L), even
in cells with intact CDK13 kinase activity. The
amount of ZC3H14 IPed was statistically unchanged
(fig. S4H, right columns). Together,
these data show that ZC3H14 S475 phosphorylation
promotes PAXT recruitment to ZC3H14.

To determine whether ZC3H14 S475 phosphorylation
is also necessary and sufficient to
activate ptRNA degradation, a two-pronged approach
was undertaken. First, short interfering
RNAs were used to decrease levels of ZFC3H1
(n = 3) or ZC3H14 (n = 3), and differential RNA
expression was assessed using 3’ seq and RNA-seq
(fig. S4M) compared with a scrambled
control (n = 3). Second, stable human melanoma
cell lines expressing nonphosphorylatable
ZC3H14 S475A (n = 2), phosphomimetic
ZC3H14 S475E (n = 3), or a control protein (n = 3)
were subjected to RNA-seq (fig. S4N). ZC3H14 S475E
was used because we were unable to make
stable lines with ZC3H14 S475D. 3’ sequencing
data from siZFC3H1, siZC3H14, siControl, and prior
CDK13mut and control samples were used to
build an expression map for ptRNAs and dominant
isoforms in human melanoma cells. This
ptRNA isoform expression map was used to
calculate differential RNA expression using
DEXseq in all conditions compared with appropriate
controls. A boxplot of significantly
changed RNAs (q < 0.1) from ZFC3H1 versus
control knockdown in human melanoma cells
demonstrated an increase in ptRNA isoforms,
but minimally changed last exons and constitutive
internal exons as expected (Fig. 4G,
red). siZC3H14 visualized in the same manner
showed very few significant expression
changes (fig. S4O), which may be caused by
the activity of residual protein or redundancy
with another protein. Expression of
ZC3H14 S475A versus CLOVER (q < 0.1) showed a
modest increase in ptRNA expression, whereas
the last and constitutive internal exons were
minimally changed (Fig. 4G, yellow). Expression
of ZC3H14 S475E versus CLOVER (q < 0.1)
caused a decrease in ptRNAs while not affecting
last or internal exons (Fig. 4G, blue) (also
see medians and Wilcoxon rank-sum comparing
ptRNAs, internal exons, and last exons
in table S5). The TP53 ptRNA, which is up-regulated
in CDK13mut cells, was up-regulated
in ZFC3H1 knockdown and upon nonphosphorylatable
ZC3H14 S475A expression, consistent
with loss of PAXT nuclear RNA degradation.
By contrast, phosphomimetic ZC3H14 S475E
caused lower expression of the TP53 ptRNA,
consistent with hyperactivation of PAXT RNA
degradation (Fig. 4H). Together, these data
show that ZC3H14 S475 phosphorylation is
necessary and sufficient to activate PAXT to
degrade ptRNAs.

To test whether expression of ZC3H14 phosphomimetic
and nonphosphorylatable mutants
affected human melanoma cell growth rate,
doubling time was measured. ZC3H14 S475A-expressing
cells proliferated at a similar rate to
CDK13mut-expressing cells, whereas ZC3H14 S475E-expressing
cells had an increased doubling
time (i.e., grew more slowly) (Fig. 4I, columns
1, 2, and 4). These data are consistent with the
hypothesis that cells with higher ptRNA levels
have an increased growth rate, whereas cells
with lower ptRNA expression have a slower
growth rate. To determine whether the effects of
CDK13mut and ZC3H14 S475A expression function
in the same or a parallel ptRNA surveillance
pathway, ZC3H14 S475 phosphomimetic and
nonphosphorylatable mutants were expressed
in CDK13mut human melanoma cells. We were
unable to recover CDK13mut cells that expressed
ZC3H14 S475E, indicating that CDK13mut cells
are addicted to ptRNA expression or that carrying
both mutations is deleterious to cell
viability. CDK13mut;ZC3H14 S475A cells had a
similar growth rate to either mutant alone
(Fig. 4I, column 3) and affected the same
ptRNAs to a similar magnitude as CDK13mut
(fig. S4, P and Q). Our data demonstrate that
mutant CDK13 expression caused inefficient
PAXT recruitment and ptRNA stabilization,
supporting an oncogenic role for ptRNAs
(fig. S4R).
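
For readers unfamiliar with the doubling-time readout used above, a minimal sketch follows. It assumes exponential growth between two cell counts, which is a common convention and not necessarily how doubling time was measured here; the function name and the example counts are hypothetical.

```python
import math

def doubling_time(n_start, n_end, elapsed_hours):
    """Doubling time under exponential growth:
    Td = t * ln(2) / ln(N_end / N_start)."""
    if n_start <= 0 or n_end <= n_start:
        raise ValueError("cell number must increase from a positive starting count")
    return elapsed_hours * math.log(2) / math.log(n_end / n_start)

# Hypothetical counts: a fourfold increase over 48 hours is two doublings.
td = doubling_time(1e5, 4e5, 48)
```

A longer doubling time therefore means slower proliferation, which is the sense in which the phosphomimetic-expressing cells grew more slowly.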


ptRNA accumulation is oncogenic 


To determine whether CDK13mut causes accumulation
of ptRNAs in other cancers, publicly
available RNA-seq data from patient tumor samples
from multiple tumor types were examined
(bladder, colorectal, lung, melanoma,
and uterus). RNA-seq data from tumors with
CDK13 kinase-domain or nonsense mutations
were compared against CDK13WT tumors with
matched tumor and patient characteristics
(table S6) using the PAXT-regulated ptRNA isoform
map developed for Fig. 4. PAXT-regulated
ptRNA isoforms are significantly up-regulated
in CDK13mut cancers compared with internal
exons (Fig. 5A; P < 2.2 x 10^-16). Last exons are
also up-regulated relative to internal exons,
suggesting a general increase in the stability
of polyadenylated RNAs in CDK13 mutant tumors
(Fig. 5A), which may have been revealed
because of the high RNA degradation levels
typically observed in patient samples. The
CDK13/PAXT target TP53 ptRNA was more
highly expressed in RNA-seq coverage profiles of
CDK13 mutant tumors compared with matched
CDK13WT controls (Fig. 5B). These data show
that patient tumors with CDK13 mutations from
many cancer types exhibit an accumulation of
PAXT-sensitive ptRNAs.

To test whether ptRNA expression is oncogenic,
human TP53 ptRNA or SUV39H1 ptRNA
versus control EGFP were expressed in melanocytes
in the Triples zebrafish melanoma
model. Human TP53 ptRNA expression caused
increased black patches at 7 weeks, consistent
with early melanoma (P < 0.0001, two-sided
chi-square test) (Fig. 5, C and D, and fig. S5A)
and expedited melanoma onset (P = 0.0391)
(Fig. 5E). Expression of human SUV39H1
ptRNA also caused increased black patches
at 7 weeks (P = 0.0051) (Fig. 5, F and G, and
fig. S5B) and expedited melanoma onset (P <
0.0001) (Fig. 5H). Quantitative PCR confirmed
ptRNA expression (fig. S5, C and D). Because
the TP53 ptRNA is oncogenic and derives from
the TP53 locus, we performed p53 immunoblots
and found that CDK13mut human melanoma
cells have intact p53 full-length protein
(fig. S5E). These data show that human ptRNAs
are sufficient to expedite melanoma onset in
zebrafish, and more generally that ptRNAs
can be oncogenic.

Because the loss of PAXT activity may represent
a more widespread mechanism of oncogenesis,
we probed public databases and
found that CDK13-regulated PAXT members
are deleted or mutated in 17% of melanomas
(Table 2). Analysis with OncodriveFM in
melanoma samples found that ZC3H18 is a
marginally significant genome-wide driver
gene (q = 0.057; P = 0.0074), whereas ZFC3H1
was not found to be a likely driver (q = 0.22;
P = 0.051) (fig. S5, F and G). The PAXT adaptor
protein ZFC3H1 is recurrently mutated
(K385Nfs*9) in 23 different patient tumors
representing a wide variety of cancers (Fig. 5I)
and has nonrecurrent mutations in 5 to 11%
of multiple cancer types, including nonmelanoma
skin cancer, endometrial cancer, colon
adenocarcinoma, and small-cell lung cancer
(fig. S5H). ZC3H18 is recurrently mutated
(R680Q/Gfs*) in 48 individual samples from a
broad variety of cancers (Fig. 5J) and has nonrecurrent
mutations in many cancers, including
nonmelanoma skin cancers, vaginal cancers,
and endometrial cancers (fig. S5I). Together,
these data suggest that nuclear RNA quality
control through the PAXT complex is critical
to avoid the accumulation of ptRNAs and their
aberrant protein products in multiple cancer
types, and support deficient nuclear RNA surveillance
as a general oncogenic mechanism.

Fig. 5. ptRNA accumulation is oncogenic. (A) ptRNA quantification from
somatic CDK13mut (n = 14) compared with matched CDK13WT (n = 14) cancers
(many types). ****P < 2.2 x 10^-16 (two-sided Wilcoxon rank-sum test). The
black horizontal line indicates the median and whiskers extend to 1.5x the
IQR. (B) IGV RNA-seq coverage plot of PAXT-target TP53 ptRNA from sample
subset from (A). (C) TP53 ptRNA in TP53 locus. IPA = intronic polyadenylation
site. CDS = coding sequence. (D and E) Triples melanoma model
zebrafish with melanocyte-specific expression of EGFP versus human TP53
ptRNA. (D) 7-week photos. (E) Percentage melanoma-free survival (log-rank). n =
number of zebrafish. (F) SUV39H1 ptRNA in SUV39H1 locus. (G and H) Triples
melanoma model zebrafish with melanocyte-specific expression of EGFP
versus human SUV39H1 ptRNA. (G) 7-week photos. (H) Percentage melanoma-free
survival (log-rank). (I and J) Lollipop plots of ZFC3H1 (I) and ZC3H18 (J) mutations
in nonredundant publicly available sequencing data from all cancers.

Table 2. CDK13-regulated PAXT subunits
are mutated or deleted in melanoma
TCGA patients.

Mutation*
7% (n = 19)
MTR4 2% (n = 6)

*Percentages of cases out of 287.


Discussion 


Our data show that CDK13 has properties consistent
with a tumor suppressor in which mutations
lead to accumulation of prematurely
terminated transcripts by preventing their 
normal degradation. CDK13 normally phos- 
phorylates ZC3H14 S475, which is necessary 
and sufficient to promote PAXT recruitment 
and ptRNA degradation. When CDK13 is mu- 
tated, failure to recruit the PAXT complex 
results in the accumulation of truncated onco- 
genic transcripts, which are exported to the 
cytoplasm and translated into truncated pro- 
teins. The expression of ptRNAs observed in 
CDK13 mutant melanoma can cause more ag- 
gressive cancer. Our data show that mutant 
CDK13 disrupts a deeply conserved pathway to 
cause ptRNA accumulation, and that ptRNA 
accumulation is oncogenic. 

In the CDK13-associated developmental dis- 
order, de novo CDK13 mutations are thought 
to work through a dominant-negative mech- 
anism. Homozygous CDK13 loss of function 
causes lethal heart malformations in mice (55), 
and likely causes embryonic lethality in humans
because homozygous deletion is not observed
in phenotypically normal people (n =
54,980) (56) or in patients with a developmental
disorder (n = 43,173) (57). Heterozygous
CDK13 deletions have been identified in nor- 
mal humans in two genomic aggregate data- 
bases (gnomAD, Database of Genomic Variants) 
(19). By contrast, heterozygous CDK13 kinase 
domain mutations cause a syndromic devel- 
opmental disorder that affects heart develop- 
ment as well as other neural crest-derived 
tissues. Our work raises the possibility that in- 
dividuals with a CDK13-related disorder could 
be predisposed to developing cancer, which 
is supported by a recent report of a 9-year-old
patient with the CDK13-related disorder developing
leukemia (58). The strong mutant CDK13-associated
disorder phenotypes are more severe
than the loss of one CDK13 allele, which supports
the idea that CDK13 kinase-domain mutations
have dominant-negative activity.

Our initial insight for this work came from
the observation that CDK13 mutations in
melanoma were enriched in the kinase domain,
which suggested a genetic function beyond
haploinsufficiency. Our data support a model
in which CDK13 heterozygous loss of function
is selected for (50% activity loss), whereas
heterozygous kinase domain mutations are
additionally selected for. This is because nonphosphorylatable
CDK13 acts in a dominant-negative
manner, that is, it blocks CDK13WT
from activating its targets (>50% activity loss).
Loss of both alleles is infrequently observed,
possibly because of reduced cellular fitness, as
was seen for a cancer cell line (59). We supposed
that CDK13 kinase-domain mutations
cause a more severe PAXT deficiency and thus
are more selected for than mutations that
cause CDK13 haploinsufficiency. Our zebrafish
genetics and the lack of CDK13 mutations in
melanoma cyclin-binding residues suggested
that CDK13mut dominant-negative activity requires
cyclin binding. Expression of CDK13mut
in a CDK13WT background recapitulated the
RNA and cancer phenotypes observed in melanoma
patients with heterozygous CDK13 mutations.
In multiple CDK13 experimental systems, we
found that mutant CDK13 causes the same
RNA and cancer phenotype, namely accumulated
ptRNAs and more aggressive melanoma.
We hypothesized that only by modeling
the human genetics and chronically expressing
kinase-mutant CDK13 in the presence
of CDK13WT would we be able to uncover the
role of CDK13 in RNA surveillance. Our work
shows that CDK13’s function in ptRNA degradation
is used broadly.

ptRNAs in CDK13mut cells are exported and
translated into truncated proteins. Transcripts
that terminate through intronic polyadenylation
sites are predicted to be exported and
translated if they escape nuclear decay (21, 45).
We are not aware of any studies documenting
proteome-wide measurements of ptRNA translation,
including intronic sequences. Large-scale
truncated protein expression would be
predicted to cause protein stress in cells, which
is consistent with the observed up-regulation
of autophagic and lysosomal proteins in CDK13
mutated melanomas. Of potentially great importance
for predicting immunotherapy responses
in patients, we found evidence that
the truncated proteins can end in translated
intronic sequences. These sequences have the
potential to produce neoantigens that would
be absent from CDK13WT cells and thus would
be predicted to elicit a highly tumor-specific
immune response.

ptRNAs derived from intronic polyadenyl- 
ation sites are enriched in cancers (60). We 
hypothesize that different cancers will accumu- 
late different ptRNAs depending on the RNAs 
expressed and the mechanism of ptRNA accu- 
mulation. Loss-of-function mutations in CDK12 
in metastatic castration-resistant prostate can- 
cer and serous ovarian carcinomas cause in- 
creased production of truncated RNAs in DNA 
repair genes (7, 8), contributing to CDK12’s 
tumor-suppressive properties. The IPA sites 
up-regulated by CDK12 loss appear to be most- 
ly but not fully distinct from those seen when 
CDK13 is lost (fig. S2E and data S1), consistent 
with CDK12 and CDK13 having divergent mech- 
anisms of ptRNA regulation (Table 3). 

Widespread ptRNA formation has also been 
reported to take place when the U1 small nu- 
clear RNA is prevented from pairing with 5’ 
splice sites in a phenomenon called “telescripting”
(61). The U1 spliceosomal RNA is recurrently
mutated in multiple cancers at base 3,
which pairs with the 5’ splice site (62). U1 
mutations could cause increased generation 
of ptRNAs and could thus represent a third 
mechanism by which ptRNA expression pro- 
motes oncogenesis. We envision mechanisms 
that govern 3’ end formation, such as telescript- 
ing, and those that govern ptRNA stability, 
such as CDK13, as being complementary ptRNA 
control mechanisms. It will be important to 
investigate the expression of ptRNAs and 
other aberrant RNAs in different cancers with 
different driver mutations because aberrant 
RNA accumulation could represent a final com- 
mon pathway. 

Germline mutations in the PAXT-associated 
proteins ZC3H14 and THOC2 cause a neurodevelopmental
disorder (51, 63), suggesting
that the regulation of ptRNAs in the nucleus 
has broad implications in development and 
disease. We propose that loss of nuclear RNA 
surveillance through CDK13 and PAXT is a 
general oncogenic mechanism. PAXT mem- 
bers regulated by CDK13 are mutated in 17% of 
melanomas in addition to the 3.9% of mela- 
nomas harboring CDK13 mutations, raising 
the possibility that loss of nuclear surveillance 
is a contributing factor in up to 21% of mela- 
noma. We showed that patients with other 
malignancies harboring CDK13 mutations also
accumulate ptRNAs. We also found that recurrent
mutations in the key PAXT members
ZFC3H1 and ZC3H18 occur in many cancers,
implying that these mutations may
be selected for. Further work will be required
to support this hypothesis. The finding that
mutant CDK13 is oncogenic through deficient
RNA surveillance and that recurrent mutations
occur in multiple PAXT components suggests
a broad, previously unrecognized tumor-suppressive
role for nuclear RNA surveillance.

Table 3. CDK13 has a related but distinct biologic role and mechanism from CDK12.

Mutated in cancer
CDK13: Melanoma (3.9%), cancer cell lines (8%)
CDK12: Prostate, ovarian (64, 65)

Mutated in developmental disorders
CDK13: De novo heterozygous kinase domain mutations
and truncating mutations cause developmental
disorders that affect neural crest-derived
tissues (congenital heart defects, dysmorphic
facial features, and intellectual development
disorder) (CHDFIDD) (15-19)
CDK12: Not detected

RNA mechanism


Methods summary 


Human patient data analysis of CDK13 was 
done by downloading publicly available mela- 
noma DNA-sequencing data, and then these 
data were used for OncodriveFM analysis and 
kinase domain enrichment analysis in mel- 
anoma patients. TCGA melanoma data were 
used for patient survival analyses and clo- 
nality analyses. Publicly available DNA tumor 
sequence data from all cancers were used to 
elucidate the kinase domain enrichment of 
mutations in cancers other than melanoma. 
In vitro kinase assays were executed with puri- 
fied CDK13 activated by a cyclin on purified 
protein substrates. Zebrafish melanocyte and
melanoma modeling was done using a rapid
F0 melanocyte-specific modeling system in
Triples (p53-/-;mitfa:BRAFV600E;mitfa-/-) zebrafish.
Stable human melanoma cell lines were
made with lentiviral transduction in A375 human
melanoma cells. 5’ to 3’ RNA-seq analysis
measured differential exon expression in four
systems: (i) TCGA melanoma publicly available
CDK13mut melanomas, (ii) zebrafish CDK13mut-expressing
melanomas, (iii) mESCs depleted
for CDK13, and (iv) CDK13mut-expressing human
melanoma cells. 3’ sequencing was executed
(Lexogen REV) and RNA cleavage site location
was measured in CDK13mut versus control
zebrafish melanoma and human melanoma
cell systems. ddPCR measured first and last
exon expression on four affected genes and
two control genes in CDK13mut and control human
melanoma cells. Nascent RNA-seq (TT-seq)
Required to activate PAXT degradation of ptRNAs


was done using a 5-min 4sU pulse in CDK13mut
and control human melanoma cells. Chromatin
immunoprecipitation was done using
published antibodies to hypophosphorylated 
RNAPII (initiating RNAPII) and RNAPII Ser2 
CTD phosphorylated (elongating RNAPII) in 
the zebrafish and human melanoma cell 
systems. Tandem mass spectrometry was done
on CDK13mut versus control zebrafish melanomas.
Peptide measurements were plotted along
normalized percent protein length. Translation
of intronic sequence was predicted assuming in-frame
translation from the upstream exon.
IP-MS was performed by IPing CDK13WT in
A375 cells and filtering for repeatability and
enrichment over control IP. ZC3H14 IP-MS
was performed in CDK13WT and CDK13mut
human melanoma cells, and ZC3H14 phosphorylation
and binding partners were assessed in
both conditions. ZC3H14 S475 phosphomimetic
protein was IPed from CDK13mut cells,
and ZC3H14 unphosphorylatable S475 was
IPed from CDK13WT cells. Proliferation and
ptRNA levels were quantified from stable human
melanoma cell lines expressing ZC3H14 S475
mutants with or without CDK13mut. TCGA
RNA-seq data were used for ptRNA isoform
quantification in other, nonmelanoma CDK13mut
cancers. Human ptRNAs were expressed in the
above zebrafish system. Recurrent mutations 
in other PAXT members were identified from 
publicly available DNA-sequencing data from 
all cancers. Please see the full materials and 
methods in the supplementary materials for 
more detailed information.

INTRODUCTION: Spinal muscular atrophy (SMA)
is the leading genetic cause of infant mortality.
SMA results from survival motor neuron (SMN)
protein insufficiency after homozygous loss of
the SMN1 gene. A closely related gene, SMN2,
differs from SMN1 by a C6T substitution (i.e., a
C-to-T transition at position 6) in exon 7 that results
in a truncated SMNΔ7 protein that fails to
fully compensate for SMN1 loss. Two recently
approved SMA drugs partially restore SMN pro- 
tein levels through splice isoform switching. A 
third drug uses viral gene complementation to 
restore SMN levels. Although up-regulation of 
SMN levels by these approved drugs effectively 
treats SMA, current therapies circumvent en- 
dogenous regulation of SMN, do not fully restore 
SMN levels, and either require repeated dosing 
or may fade over time. A one-time, permanent 
treatment that restores endogenous gene expres- 
sion and preserves native SMN regulation may ad- 
dress these limitations of existing SMA therapies. 


RATIONALE: Genome editing of SMN2, which
is present in all SMA patients, could enable a
one-time treatment for SMA that restores normal
SMN transcript and protein levels while preserving
their endogenous regulatory mechanisms.
We developed one-time genome editing
approaches targeting endogenous SMN2 that
restore SMN protein abundance to normal levels
and rescue disease phenotypes in cell and mouse
models of SMA. We tested 79 base editing and
nuclease strategies that modify five posttranscriptional
and posttranslational regulatory regions
in SMN2 to increase SMN protein levels.

Base editing of SMN2 rescues SMA in mice. (A) A customized SMN2
ABE converts insufficient SMN2 genes into healthy SMN1
genes to produce full-length SMN protein. (B) Dual-AAV9-mediated
delivery of ABE and green fluorescent protein (GFP)
into SMA neonates. (C) In vivo conversion of SMN2 C6T in the
central nervous system of treated animals. (D) Motor unit


RESULTS: Each of the SMN2 nuclease and base
editing strategies tested durably increased
SMN protein levels between 9- and 50-fold.
Base editing efficiently converted SMN2 to
SMN1 genes and, unlike nuclease editing strategies
or current SMA drugs, fully restored
SMN transcript and protein levels to those of
wild-type cells (~40-fold increase) with minimal
off-target editing across the genome and transcriptome.
Intracerebroventricular injection
of adeno-associated virus serotype 9 encoding
an adenine base editor (AAV9-ABE) resulted in
87% average conversion of SMN2 C6T among
transduced cells in the central nervous system
of Δ7SMA mice, improved motor function, and
extended life span, despite Δ7SMA mice having
a much shorter window for treatment than
human patients (<6 days for mice versus months
to years for humans) that ends earlier than
typical in vivo base editing time scales (weeks).
One-time in vivo coadministration of AAV9-ABE
with the antisense oligonucleotide drug
nusinersen expanded the therapeutic window
for gene correction, further improving the life
span of AAV9-ABE-treated animals to an average
of 111 days, compared with an average of
17 days for untreated animals.


CONCLUSION: Despite the incongruent time- 
line of base editing-mediated rescue for ideal 
rescue of Δ7SMA mice, AAV9-ABE treatment
yielded substantial improvements in life span 
and motor function. Combination treatment 
with nusinersen enables Δ7SMA mouse rescue
that resembles presymptomatic up-regulation 
of SMN levels. In humans, the therapeutic 
window is much longer. Therefore, we antic- 
ipate that AAV9-ABE may achieve presympto- 
matic rescue as a standalone therapeutic in 
SMA patients. Our study also demonstrates 
the compatibility of base editing with nusiner- 
sen, which may inform future clinical applica- 
tions. Together, these findings support the 
potential of base editing as a future one-time 
treatment for SMA that restores native SMN 
production while preserving endogenous reg- 
ulatory mechanisms of SMN expression. 

Spinal muscular atrophy (SMA), the leading genetic cause of infant mortality, arises from survival motor
neuron (SMN) protein insufficiency resulting from SMN1 loss. Approved therapies circumvent
endogenous SMN regulation and require repeated dosing or may wane. We describe genome editing
of SMN2, an insufficient copy of SMN1 harboring a C6>T mutation, to permanently restore SMN
protein levels and rescue SMA phenotypes. We used nucleases or base editors to modify five SMN2
regulatory regions. Base editing converted SMN2 T6>C, restoring SMN protein levels to wild type.
Adeno-associated virus serotype 9-mediated base editor delivery in Δ7SMA mice yielded 87% average
T6>C conversion, improved motor function, and extended average life span, which was enhanced by 
one-time base editor and nusinersen coadministration (111 versus 17 days untreated). These findings 
demonstrate the potential of a one-time base editing treatment for SMA. 


Spinal muscular atrophy (SMA) is a progressive
motor neuron disease and the leading genetic
cause of infant mortality (1-3). SMA is caused
by homozygous loss or mutation of the essential
survival motor neuron 1 (SMN1) gene. One or more
copies of the nearly identical (>99.9% sequence
identity) SMN2 gene partially compensate for
the loss of SMN1 (1, 4, 5). However, SMN1 and
SMN2 differ by a silent C•G-to-T•A substitution
at nucleotide position 6 of exon 7 (C6T) that
results in exon 7 skipping in mRNA transcripts
(Fig. 1A) (6, 7). The resulting truncated SMNΔ7
protein is rapidly degraded, causing SMN protein
insufficiency that results in loss of motor
neurons, paralysis, and death (8-10). Untreated
patients with the most common form of SMA
(type I) live a median of 6 months (11, 12).

Up-regulation of SMN protein can rescue
motor function and substantially improve the
prognosis of SMA patients (13-15). However,
endogenous SMN protein is subject to multiple
levels of regulation that differ across tissues
(16-18). Whereas SMN underexpression can fail
to rescue SMA phenotypes, SMN overexpression
can cause aggregation, toxicity, and tissue
pathology (19-21). Three breakthrough
therapeutics effectively rescue many SMA
phenotypes and improve life span by
up-regulating SMN protein (22). The antisense
oligonucleotide (ASO) nusinersen (Spinraza)
and the small-molecule risdiplam (Evrysdi)
both promote splicing inclusion of exon 7,
resulting in ~2-fold up-regulation of SMN levels,
and have proven highly effective in the clinic
(23, 24). However, SMN protein is reduced
by ~85% in the spinal cord of untreated SMA
patients (25-27). The partial recovery of SMN
protein promoted by these therapeutics may
be insufficient at early time points and in
damaged tissues, potentially underlying the
limited rescue observed in some patients (28, 29).
Moreover, the transient nature of these
therapeutics necessitates repeated administration
of costly drugs throughout patients’ lifetimes
(30, 31).

Adeno-associated virus (AAV)-mediated gene 
complementation of full-length SMN cDNA by 
onasemnogene abeparvovec-xioi (Zolgensma) 
leads to constitutive production of SMN in 
transduced cells that is not under endogenous 
control (32-34). In the spinal cord, Zolgensma 
up-regulates SMN transcript levels by ~25% 
(35), while in other tissues such as the liver 
and dorsal root ganglia, gene complementation
may cause SMN overexpression that under
some circumstances can cause long-term toxicity
(21). We do not yet know whether SMN
overexpression induces toxicity in patients
treated with Zolgensma or how long
AAV-mediated expression will persist in motor
neurons in patients (36, 37). As such, a
therapeutic modality that restores endogenous
gene expression and preserves native SMN
regulation by a one-time permanent treatment
may address remaining limitations of existing
SMA therapies. Genome editing of SMN2, which
is present in all SMA patients regardless of
the nature of their SMN1 mutation, could
enable a one-time treatment for SMA that
restores native SMN transcript and protein levels
while preserving their endogenous regulatory
mechanisms.


Results 
Predictable and precise nuclease editing of SMN2 
ISS-N1 increases SMN protein levels 


SMN protein production from SMN1 and
SMN2 genes is constrained by transcriptional, 
transcriptomic, and posttranslational regu- 
latory sequences. We explored using Cas nu- 
cleases to create gain-of-function alleles in 
SMN2 regulatory sequences that up-regulate 
SMN levels. The inclusion of exon 7, which 
underlies SMN protein stability, is strongly 
influenced by the downstream intronic splic- 
ing silencer ISS-N1 that harbors two heter- 
ogeneous nuclear ribonucleoprotein (hnRNP) 
A1/A2 binding sites (Fig. 1A) (38). Deletions
within and downstream of the 3’ hnRNP A1/ 
A2 binding domain improve exon 7 inclusion 
(38-41). We speculated that Cas9 nuclease- 
mediated disruption of the ISS-N1 genomic 
locus might increase exon 7 inclusion in SMN2 
splicing and thereby increase SMN protein 
levels (strategy A, Fig. 1B). 

We used inDelphi, a machine learning mod- 
el of SpCas9 nuclease editing outcomes, to 
predict insertion and deletion (indel) outcomes 
at the ISS-N1 locus that disrupt hnRNP A1/A2 
binding and improve full-length SMN splicing 
of SMN2 (Fig. 1B) (42). InDelphi identified 
10 spacer sequences predicted to induce
≥4-nucleotide (nt) deletions at ISS-N1 and loss
of ≥1 nt of the 3’ hnRNP A1/A2 domain
(“predicted % precision”). We estimated editing
efficiencies of these strategies on the basis of 
the reported protospacer adjacent motif (PAM) 
compatibility of these spacer sequences with 
SpCas9-variant nucleases (“predicted % PAM 
efficiency”) (43-46). From 19 possible nuclease 
editing strategies (A1 to A19, defined as
different combinations of genome editing agents
and guide RNAs), we selected nine (A2, A3, A5,
A6, A13, A14, A16, A17, and A19) for
experimental testing.

We cotransfected Δ7SMA mouse embryonic
stem cells (mESCs), which lack endogenous
Smn1, are homozygous for the full-length
human SMN2 gene, carry human SMNΔ7-cDNA
transgenes, and harbor a Mnx1:GFP reporter
of motor neurons (SMN2+/+; SMNΔ7; Smn−/−;
Mnx1:GFP) (47), with nuclease expression
plasmids that carry a blasticidin-resistance
cassette and single-guide RNA (sgRNA) plasmids
that carry a hygromycin-resistance cassette.
Both plasmids also contain Tol2 transposase
sequences to enable stable transposon-mediated
genomic integration and antibiotic selection.
We achieved 92 ± 5.6% average indel
frequencies for the top four strategies targeting the
ISS-N1 locus (A2, A3, A5, and A6) (Fig. 1B).

Fig. 1 (caption fragment). [...] protein levels after C-nuc and
C-CBE editing or treatment with risdiplam, normalized to histone
H3. (I) Distribution of SMN2 transcript variants after C-nuc
and C-CBE editing. Experiments were performed in Δ7SMA mESCs.

To assess whether nuclease-mediated editing
of ISS-N1 improved exon 7 inclusion, we
performed reverse transcription PCR (RT-PCR)
of SMN2 from exons 6 to 8 and quantified
SMNΔ7 and full-length SMN products (Fig.
1C). We found that all strategies that edited
ISS-N1 with high efficiency (85%) resulted in
a significant increase in exon 7 inclusion
averaging 2.2-fold relative to an unrelated sgRNA
control (Welch’s two-tailed t test, P = 0.01). The
increase in exon 7 inclusion caused a
substantial increase in SMN protein of 17-fold by A2
and 13-fold by A6 relative to untreated controls
(values normalized to histone H3, Welch’s
two-tailed t test, P = 0.02; Fig. 1D and fig. S1A).
Collectively, these results demonstrate that 
disruption of the ISS-N1 genomic locus can 
stably increase full-length SMN splicing and 
protein phenotypes of SMA. 
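
The comparisons above are reported with Welch's two-tailed t test, which does not assume equal variances between groups. As a minimal sketch of how that statistic is formed, the following computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom. The function name `welch_t` and the replicate values are hypothetical and illustrative only; they are not data from this study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t test.

    t  = (mean_a - mean_b) / sqrt(s_a^2/n_a + s_b^2/n_b)
    df = Welch-Satterthwaite approximation.
    """
    # Squared standard errors of each group mean (sample variance / n).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical exon 7 inclusion fractions from three replicates each
# (illustrative values, not measurements from the paper).
edited = [0.52, 0.48, 0.55]
control = [0.22, 0.25, 0.23]

t_stat, dof = welch_t(edited, control)
print(f"t = {t_stat:.2f}, df = {dof:.2f}")
```

A two-tailed P value would then be read from the t distribution with `dof` degrees of freedom, for example with a statistics package; that step is omitted here to keep the sketch dependency-free.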


Predictable and precise genome editing of 
SMN2 exon 8 increases SMN protein levels 


In an alternative nuclease-mediated approach 
to up-regulate SMN levels, we disrupted post- 
translational regulatory sequences in SMN2 
to increase SMNΔ7 protein stability. The
critical difference between full-length SMN and
the unstable SMNΔ7 protein is the substitution
of 16 amino acids encoded by exon 7 with
EMLA, a four-residue degron encoded by exon
8 (Fig. 1A) (8). Extending the coding sequence
of exon 8 with five or more heterologous
amino acids obscures SMNΔ7 C-terminal
degradation signals. These modified SMNΔ7
(SMNΔ7mod) protein variants have increased
stability and rescue survival and motor
phenotypes of severe SMA mice (48). We designed
strategies for Cas nuclease-mediated disruption
of exon 8 to generate similar stabilized
SMNΔ7mod proteins with therapeutic potential
(strategies B1 to B16; supplementary text
and Fig. 1E) and observed up to a 7.0-fold
increase in SMN protein levels by B11 (Welch’s
two-tailed t test, P = 0.007; Fig. 1F and fig. S1B).

Some exon 8 editing strategies improved 
SMN protein stability more than expected 
given the observed edited genotypes (Fig. 1, E 
and F). For example, precision-edited geno- 
types were 1.9-fold higher in frequency after 
B9 editing than B1, yet SMNΔ7mod protein
levels were greater in cells edited with B1 (9.1- 
fold) than B9 (5.7-fold). These data suggest 
that additional edited genotypes may improve 
SMN protein stability. Inspection of the non- 
precisely edited fraction of edited alleles re- 
vealed that B1 editing frequently induces indels 
at the exon 8 splice acceptor. Thus, we hy- 
pothesized that disrupting splicing of exon 
8 improves SMN protein stability (49). 

To test this hypothesis, we disrupted the 
canonical AG splice acceptor (SA) motif of 
exon 8 using either a nuclease or cytosine base 
editor (C-nuc or C-CBE, respectively) in Δ7SMA
mESCs (Fig. 1G) (45, 50) and observed 54 ±
2.3% indels from C-nuc and 89 ± 2.3% cytosine
base editing from C-CBE. Notably, C-nuc editing
resulted in a complex mixture of indel
genotypes at the intron-exon junction that
resulted in deletion of additional nucleotides
beyond the AG motif. Both strategies
significantly increased SMN levels in Δ7SMA mESCs,
similar to treatment with risdiplam (3.3-fold
for C-nuc, 9.5-fold for C-CBE, and 9.1-fold for
risdiplam relative to untreated; Welch’s
two-tailed t test, P < 0.05; Fig. 1H and fig. S1, D to
G), indicating that alternative splicing at exon
8 improves the stability of SMN2 gene products.

We investigated how exon 8 SA disruption
affects SMN2 transcripts (supplementary text). 
C-CBE editing induced a minor increase in 
SMN2 mRNA that only partially explains the 
9.5-fold increase in SMN levels (fig. S1H). We
also observed a profound shift in SMN2 splice
products (Fig. 1I). We investigated whether
these alternative splice isoforms improve
stability of SMN proteins and found that
transcripts including exon 7 were increased twofold
by C-CBE (63 ± 2.0%) and 1.6-fold by C-nuc
(50 ± 1.1%) relative to untreated cells (24 ±
1.4%). These transcripts often retain intron 7
as in some functional transcript variants of
SMN2 (ENST00000511812.5, fig. S1I). Notably,
all transcripts that include exon 7 encode
full-length SMN protein and can therefore
complement loss of SMN1. Thus, the substantial
increase in SMN protein levels after exon
8 SA editing predominantly arises from an
increase in normal full-length SMN.

Collectively, the tested SMN2 editing strat- 
egies permanently increase SMN protein levels 
up to 17-fold (strategy A2), 9.1-fold (strategy 
B1), and 9.5-fold (strategy C-CBE). As a 1.5- to 
2-fold increase in SMN protein is therapeu- 
tic for SMA patients (23, 24), these strategies 
represent promising approaches for further 
studies. 


Efficient and precise base editing of SMN2 
splice regulatory elements 


Several single-nucleotide substitutions in exon 
7 strongly regulate splicing of SMN2, including 
the C-to-T transition at position 6 (C6T) that 
differentiates SMN1 (C) from SMN2 (T) genes
(Fig. 1A), and T44C, G52A, and A54G at the 3’
end of exon 7 (51). Using existing and newly
developed BE-Hive predictive models of
base-editing outcomes (supplementary text and fig.
S2, A to E), we identified 42 strategies
(combinations of base editors and guide RNAs) to
modify exon 7 splicing regulatory elements
(SREs) (Fig. 2, A to C, and fig. S2, F and G).
We designed 13 spacers targeting C6T using
ABE8e (strategies D1 to D19) or targeting C6T,
T44C, G52A, and A54G using ABE8e, ABE7.10,
and EA-BE4 deaminases (strategies E1 to E23).
We paired these spacers with 12 compatible 
SpCas9 variants on the basis of reported PAM 
preferences (“predicted % PAM efficiency”) 
(43, 46, 50). We validated these strategies in 
Δ7SMA mESCs and found that the BE-Hive
models of SpCas9 base editors predicted edited
outcomes of Cas-variant base editors with
high accuracy [Cas9-NG (46), NRTH, NRRH,
and NRCH (44), Pearson’s correlation
coefficient (r) = 0.810; chimeric SpyMac and iSpMac
(45), Pearson’s r = 0.910; supplementary text
and Fig. 2D].
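
Model accuracy here is summarized by Pearson's correlation coefficient between predicted and observed allele frequencies. A minimal sketch of that calculation is shown below; the function name `pearson_r` and the paired values are hypothetical and illustrative, and do not reproduce BE-Hive output or the data in Fig. 2D.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: covariance of x and y divided by the product of
    their standard deviations (computed from centered sums)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical predicted vs. observed frequencies (%) of edited alleles
# for five guide/editor combinations (illustrative values only).
predicted = [95.0, 80.0, 60.0, 40.0, 20.0]
observed = [99.0, 70.0, 65.0, 30.0, 25.0]

r = pearson_r(predicted, observed)
print(f"Pearson's r = {r:.3f}")
```

A value of r near 1 indicates that observed frequencies rise almost linearly with predicted frequencies, which is the sense in which the text describes the predictions as highly accurate.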

Base editing of exon 7 SREs was highly
efficient. At 3’ SREs, we achieved 69 ± 5.0%
T44C editing by E14, 92 ± 4.0% G52A editing
by E20, and 95 ± 5.1% A54G editing by E23
(fig. S2, F and G). We achieved nearly
complete (94 to 99.5%) C6T A•T-to-G•C conversion
by strategies targeting C6T at positions P5 (D1
and D2), P8 (D10 and D11), and P10 (D18 and
D19) within the protospacer (Fig. 2, A to C).
The deaminase in ABE7.10 enabled up to 64 ±
2.5% conversion of T6>C (E7, fig. S2G) (52, 53).

The frequency of edited alleles with
single-nucleotide T6>C conversion alone (i.e., without
any bystander edits or indels) varied
substantially between the most efficient C6T
editing strategies, ranging from 82 ± 1.9% for
D10 editing to 40 ± 13% for D19 editing (Fig.
2E). Prior studies suggest that the coding
sequence at the SMN C terminus beyond exon
6 does not strongly affect SMN protein
function, and it is therefore unlikely that
single-nucleotide editing precision of C6T is imperative
for rescue of SMA (8, 48, 54). Maximizing the
sequence similarity of modified SMN2 genes
to native SMN1, however, may preserve
additional regulatory interactions, including those
not yet known. D10, the strategy with the
highest precision and efficiency (99 ± 0.7%),
did not induce measurable indels, and its
induced bystander missense nucleotide
substitutions (18 ± 2.4%) have previously been
shown to benefit inclusion of exon 7 by
improved protein binding at the exonic splicing
enhancer (fig. S2H) (55, 56). Together, these
results establish efficient base editing
strategies to convert SMN2 T6>C with high fidelity
and few undesirable by-products.


Base editing of SMN2 splice regulatory elements 
rescues SMN protein levels 


Next, we sought to determine whether base 
editing of exon 7 SREs results in functional 
rescue of cellular SMA phenotypes. The top 
six ABE8e editing strategies that converted
C6T in >97% of alleles increased exon 7
inclusion to 78 ± 10.2% on average, up to
9.7-fold higher than untreated cells (87 ± 1.5% by
D10 compared with 9.0 ± 6.6% in untreated;
Welch’s two-tailed t test, P < 0.002; Fig. 2F).
These results are on par with, or exceed,
maximum exon 7 inclusion by risdiplam or
nusinersen treatment of Δ7SMA mESCs (89 ±
4.3% and 80 ± 0.3%, respectively; Fig. 2F and
fig. S1E) and resemble splicing ratios of SMN1
genes (82 ± 7.3% in U2OS cells) (38, 39). Base
editing of 3’ SREs in exon 7 also improved
inclusion, averaging 60 ± 3.2% after T44C
editing by E14, 76 ± 12% after G52A editing by
E20, and 50 ± 8.6% after A54G editing by E23
(fig. S2I). These data demonstrate that base
editing of various exon 7 SREs can increase 
full-length SMN splice products. 
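
The fold changes quoted above follow directly from the stated mean inclusion percentages. As a small worked check, using only the means given in the text (87% for D10, 78% for the top six strategies on average, and 9.0% for untreated cells), the ratios can be reproduced as follows. The helper name `fold_change` is introduced here for illustration, and the sketch ignores the reported standard deviations.

```python
def fold_change(treated: float, untreated: float) -> float:
    """Ratio of a treated measurement to its untreated baseline."""
    if untreated <= 0:
        raise ValueError("untreated baseline must be positive")
    return treated / untreated


# Mean exon 7 inclusion (%) as stated in the text above.
UNTREATED = 9.0
D10 = 87.0
TOP_SIX_MEAN = 78.0

d10_fold = fold_change(D10, UNTREATED)
top_six_fold = fold_change(TOP_SIX_MEAN, UNTREATED)
print(f"D10: {d10_fold:.1f}-fold; top six on average: {top_six_fold:.1f}-fold")
```

The D10 ratio rounds to the 9.7-fold figure reported in the text.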

Base editing of 3’ SREs increased SMN pro- 
tein levels in ways that did not closely mirror 
observed improvements in exon 7 inclusion. 
We detected a 3.4-fold increase in SMN
protein by E14 base editing of T44C, a 23-fold
increase by E20 editing of G52A, and a 1.6-fold
increase by E23 editing of A54G (Welch’s
two-tailed t test, P = 0.02), despite all three edits
inducing comparable improvements in exon 7
inclusion (figs. S2I and S3, A and B). We
hypothesized that unintended bystander edits
may underlie this persistent protein instability
and found that the T44C and A54G editing
strategies frequently ablate the nearby TAA
stop codon in exon 7 (fig. S2, F and G). A
failure to terminate translation in exon 7 leads
to the extension of full-length SMN proteins
with the EMLA degron encoded by exon 8
(Fig. 1A). Thus, imprecise editing of T44C or
A54G by E14 or E23 results in the translation
of unstable full-length SMN-EMLA fusions that
prevent up-regulation of SMN protein levels.
Editing of G52A by E20 uses the EA-BE4
cytosine deaminase that does not recognize TAA
as a substrate and therefore does not induce
nonsilent bystander changes in 99 ± 0.1% of
edited alleles, resulting in a 23-fold increase
in SMN protein levels.

Fig. 2. Adenine base editing of SMN2 C6T. (A) Adenine base editing of SMN2
C6T (strategy D). (B) Target nucleotide position within the protospacer (P#)
for base editing. A typical base editor activity window is illustrated as a heatmap.
(C) The table shows ABE8e editing strategies with color-coded Cas-variant
domains and their corresponding spacers. The protospacer position of the C6T
target nucleotide (P#) is indicated. Graph shows genome editing outcomes in
Δ7SMA mESCs. (D) Correlation of BE-Hive predicted editing outcomes with
observed allele frequencies after base editing with ABE7.10 or ABE8e deaminases
fused to different Cas variants. Pearson’s r is shown; 95% confidence interval
(CI) ranges are 0.9408 to 0.9998 for SpCas9, 0.5823 to 0.9201 for SpCas9
engineered and evolved variants, and 0.7557 to 0.9689 for SpyMac Cas variants.
(E) Plot of base editing efficiency and single-nucleotide editing precision of C6T by
the indicated ABE and spacer combinations. (F) Exon 7 inclusion in SMN mRNA
after editing by the indicated strategies, measured by automated electrophoresis.
(G) SMN protein levels after editing by the indicated strategies, normalized to
histone H3. (H) On-target and off-target base editing of strategy D10 in HEK293T
cells. Bars show editing of the most frequently edited nucleotide at each locus, with
the P# position shown in parentheses. Error bars indicate standard deviations.

Base editing of exon 7 C6T resulted in the 
greatest up-regulation of SMN protein. The top 
six ABE8e editing strategies that correct C6T 
in >97% of alleles induced a 41-fold average 
increase in SMN protein levels compared with 
untreated controls (normalized to H3, Welch’s
two-tailed t test, P < 0.0002; Fig. 2G and fig.
S3C), indicating complete rescue of normal
SMN protein levels in Δ7SMA mESCs, which
are reduced >95% relative to wild-type mESCs
(47). Despite inducing a comparable increase
in exon 7 inclusion, base editing of C6T enabled
a 4.5-fold and 1.5-fold greater increase in SMN
protein levels than risdiplam and nusinersen
treatment of Δ7SMA mESCs (9-fold and 17-fold,
respectively, compared with 41-fold on average
across the top six strategy D approaches;
Figs. 1H and 2G and figs. S1, D, F, and G, and
S3, C, E, and F). Normal levels of SMN protein
are essential to the function, survival, and
long-term health of all species in the animal 
kingdom (57-60). Restoring wild-type levels 
of SMN protein as achieved through a base 
editing strategy may thus best maximize the 
long-term health of SMA patients. 

Among all genome editing strategies tested, 
base editing of C6T by D10 induces the great- 
est increase in exon 7 inclusion (87 + 1.5%) and 
best recapitulates native SMN protein levels 
(95% of wild-type levels, a 38-fold increase 
versus untreated Δ7SMA mESCs). D10 base
editing is highly efficient (99 ± 0.7%) with high
on-target precision (82 ± 0.0%). The SMN2
gene arose from a duplication of the
chromosomal region containing SMN1 and shares an
identical promoter and >99.9% sequence
identity with SMN1, including 100% DNA
conservation of its protein-coding sequence other
than exon 7 C6T (1, 4, 5). We performed reverse
transcription-quantitative PCR and quantified
SMN2 mRNA levels in edited cells, confirming
that SMN2 mRNA abundance is not affected
by D10 base editing compared with untreated
Δ7SMA mESCs or after ABE8e transfection
with an unrelated sgRNA (fig. S3G). Together,
these data indicate that D10 editing of SMN2
faithfully reproduces the genomic sequence
and function of native SMN1 alleles. Therefore,
we selected strategy D10 for further study. 
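
Two figures stated in this section can be cross-checked against each other: D10 editing reaches 95% of wild-type SMN protein, described as a 38-fold increase over untreated cells, and untreated cells are described as reduced by more than 95% relative to wild type. The sketch below back-calculates the implied untreated baseline from the first pair of numbers. The helper name `implied_baseline` is hypothetical, and the calculation assumes both figures refer to the same normalized measurement.

```python
def implied_baseline(restored_fraction: float, fold_increase: float) -> float:
    """Untreated level, as a fraction of wild type, implied by a restored
    level (fraction of wild type) and the fold increase that produced it."""
    if fold_increase <= 0:
        raise ValueError("fold increase must be positive")
    return restored_fraction / fold_increase


# Values stated in the text: 95% of wild-type levels, a 38-fold increase.
baseline = implied_baseline(0.95, 38)
reduction = 1.0 - baseline

print(f"implied untreated level: {baseline:.1%} of wild type")
print(f"implied reduction vs. wild type: {reduction:.1%}")
```

The implied untreated level is about 2.5% of wild type, a reduction of about 97.5%, which is consistent with the statement that SMN protein is reduced by more than 95% in untreated cells.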


Off-target editing analysis of ABE8e targeting 
SMN2 C6T in the human genome 


Some base editors can induce off-target
deamination in cells, including Cas-dependent
off-target DNA editing and Cas-independent
off-target DNA or RNA editing (61-65). Genomic
and transcriptomic off-target deamination by
adenine base editors without involvement of
the Cas protein component is rare, and
deaminase variants that minimize these events
have been reported (61, 66). We assessed the
Cas-dependent genome specificity of the D10
strategy (ABE8e-SpyMac and P8 sgRNA) by
characterizing SpyMac Cas9 nuclease with P8
sgRNA using CIRCLE-seq (circularization for
in vitro reporting of cleavage effects by
sequencing) (67), an unbiased and sensitive
empirical in vitro off-target detection method.
Potential off-target sites nominated by CIRCLE- 
seq can then be sequenced in-depth in base- 
edited human cells to provide a sensitive 
genome-wide analysis of off-target genome 
editing events induced by the D10 strategy 
(67, 68). 

We generated purified D10 strategy ribonu- 
cleoprotein (RNP) complexes containing SpyMac 
nuclease and P8 sgRNA to treat human ge- 
nomic DNA from human embryonic kidney 
(HEK) 293T cells in vitro and analyzed rare off- 
target genomic cleavage events (fig. S3H). We 
identified 55 candidate SpyMac-dependent 
DNA off-target loci nominated by the CIRCLE- 
seq method. Next, we measured on-target and 
genomic off-target editing at the top 23 
CIRCLE-seq-nominated loci in human cells 
(supplementary text, Fig. 2H, and fig. S3I).
We achieved 49 ± 1.8% C6T on-target base
editing at SMN2 in HEK293T cells and
observed minimal base editing at SMN1 (0.15 ±
0.07%), which is generally absent in SMA
patients. We detected minor levels of D10
base editing at the off-target site ranked 19
(0.41 ± 0.14%), which is in an intergenic region
of chromosome 15, and no evident base edit- 
ing (<0.03% over untreated cells) at the other 
21 assayed potential off-target loci. These data 
indicate high genomic target specificity of the 
D10 base editing strategy for the on-target 
locus. 

Together, these experiments did not detect 
any coding mutations or sequence changes of 
anticipated physiological significance in the 
human genome, and they support continued 
preclinical evaluation of the D10 strategy, in- 
cluding assessment of base editor off-target 
editing measured in various tissues that may 
accumulate over an extended period of time. 
We refer to the D10 editing strategy as the 
“ABE strategy” hereafter. 


Viral delivery of ABE enables efficient in vivo 
conversion of SMN2 C6T 


To enable in vivo SMN2 C6T conversion in an 
animal model of SMA, we designed an AAV 
strategy to package ABE8e-SpyMac and the 
P8 sgRNA for delivery (v6 AAV-ABE8e; sup- 
plementary text, Fig. 3A, and fig. S3J). The 
AAV serotype 9 (AAV9) has a well-established 
tropism for neurons in the central nervous
system (CNS) of a wide range of organisms,
including Δ7SMA mice and human patients
(69-71). In the cortex, AAV9 has been shown
to almost exclusively target neurons (71), and
intracerebroventricular (ICV) or systemic
injection in neonates results in efficient
transduction of spinal motor neurons to enable
rescue of SMA disease phenotypes and lethality
in both mice and humans (13, 32, 72). Thus,
we selected AAV9 for delivery of our D10 ABE
strategy (“AAV9-ABE”) to Δ7SMA neonates by
ICV injection to correct the SMN2 C6T target 
in vivo (Fig. 3B). 

We intracerebroventricularly injected SMA 
neonates with 2.7 x 10” vector genomes per 
kilogram of body weight (vg/kg) of the dual 
AAV9-ABE vectors, along with 2.7 x 10” vg/kg 
AAV9-Cbh-eGFP-KASH (Klarsicht/ANC-1/Syne-1 
homology domain, hereafter AAV9-GFP) (73) 
to serve as a viral transduction control. This 
dose is comparable to doses used for PND0 ICV
AAV administration of Zolgensma for rescue
of Δ7SMA mice, and of other base editor
AAVs that enable efficient genome editing 
in mice (32, 73). We observed typical trans- 
duction patterns of AAV9 in the spinal cord 
(Fig. 3, C to E; supplementary text; and fig. 
S4A) (32, 33, 74). We quantified green fluo- 
rescent protein (GFP) and choline acetyl- 
transferase (ChAT) double-positive cells in 
the ventral horn of spinal cords from injected 
mice and observed a mean transduction effi- 
ciency of 43% in spinal motor neurons (Fig. 
3F), consistent with transduction efficiencies 
>20% previously shown to enable significant 
phenotypic rescue of Δ7SMA mice after ICV
injection of self-complementary AAV9-SMN 
(Zolgensma) (32). Transduction of spinal mo- 
tor neurons using 2.97 x 10” vg/kg AAV9- 
GFP alone was similar (median: 46%) to 
transduction efficiencies using the 10-fold 
lower concentration of 2.7 x 10” vg/kg, sug- 
gesting that the low-dose cotransduction of 
AAV9-GFP accurately represents the subset of 
cells transduced by AAV9-ABE. 

Next, we assessed base editing in trans- 
duced cells (supplementary text and fig. S4B). 
We isolated cortical nuclei of treated animals 
and enriched for AAV9-transduction by sort- 
ing GFP-positive cells as previously described 
(73, 75). We observed 87 ± 3.5% conversion of
SMN2 C6T among transduced cells (Fig. 3G), a
2.4-fold enrichment over unsorted tissue (37
± 4.7%), with high single-nucleotide precision
for C6T alone (73 ± 2.7%) and few indels
(<0.4 ± 0.1%) or bystander edits, similar to D10
editing in Δ7SMA mESCs (Fig. 2E and figs.
S2H and S4C). Collectively, these data
confirm that ICV injection of AAV9-ABE in
Δ7SMA neonates enables efficient and precise
conversion of SMN2 C6T in the CNS of treated
animals with minimal undesirable by-products 
(55, 56, 76). 

Fig. 3 (caption fragment). [...] wild-type Δ7SMA mice at 25 weeks old, intracerebroventricularly injected
PND0 and PND1 with AAV9-ABE, AAV9-GFP, or uninjected, as indicated. GFP indicates
transduction, ChAT labels spinal motor neurons in the ventral horn, NeuN labels
postmitotic neurons, GFAP labels astrocytes, and 4',6-diamidino-2-phenylindole
(DAPI) stains all nuclei. (F) Quantification of GFP and ChAT double-positive cells
within the ventral horn (n = 3 animals). (G) Base editing in bulk and GFP+ flow-sorted
nuclei of Δ7SMA mice treated with AAV9-ABE+AAV9-GFP (n = 5), AAV9-GFP (n = 4), [...]
on the graph. (I) Schematic of motor neuron differentiation (MND) and caudal
neural differentiation (CND) of Δ7SMA mESCs. EB, embryoid body; RA,
retinoic acid; GDNF, glial cell line-derived neurotrophic factor; ND, neural
differentiation; GF, growth factor; SmAG, smoothened agonist; NG, neural growth.
(J) Whole-transcriptome A-to-I RNA off-target editing analysis in Δ7SMA mESCs
(n = 3) and CND (n = 3) and MND (n = 3) differentiated cells stably expressing the
D1 strategy. Error bars indicate standard deviations.

Base editing conversion of C6T effectively
converts the native SMN2 gene to SMN1,
thereby restoring SMN protein levels to those
of wild-type cells. Current SMA drugs induce
non-native SMN levels (23, 24, 32-35) and
require repeated dosing or may fade over time.
The permanent and precise editing of
endogenous SMN2 genes that preserves native
transcript levels and native regulatory
mechanisms governing SMN expression thus may
address shortcomings of existing SMA
therapies (1, 4, 5, 21, 28, 77).


In vitro and in vivo DNA and RNA off-target 
analysis of ABE8e targeting SMN2 C6T 


In addition to the off-target analysis in human 
cells described above, we also assessed the 
DNA and RNA specificity of the ABE strategy 
in mouse cells in vitro and in vivo. We per- 
formed CIRCLE-seq and validated the top 35 
nominated sites in Δ7SMA mESCs
(supplementary text and fig. S4D). We achieved 95 ±
0.0% on-target editing at the SMN2 transgene
and only observed substantial editing at
off-target site 5 in an intron of the mucin 16 gene
(Muc16, 31 ± 1.9%) that is not expressed in the
CNS (fig. S4E) (78). Next, we compared this
analysis to off-target editing in vivo after
AAV9-ABE ICV injection in Δ7SMA neonates by
performing verification of in vivo off targets (VIVO)
(79). We observed 10-27% (average 15 ± 7%)
editing at off-target site 5 in intron 54 of Muc16
and 0.1-0.9% (average 0.5 ± 0.3%) editing at
the noncoding off-target site rank 15,
compared with 87 ± 3.5% average on-target
editing of SMN2 among GFP-positive cells in the
CNS across five animals (Fig. 3, G and H).
These animals ranged from 4 to 18 weeks of 
age at the time of off-target analysis (26, 36, 
42, 80, and 127 days old), and we observed no 
increase in off-target editing events over time. 
Thus, off-target editing outcomes observed 
in cell culture experiments were consistent 
with those observed in vivo over 18 weeks (79). 
The ABE strategy did not induce any de- 
tected coding mutations in either human or 
mouse genomes, and off-target editing in vivo 
was lower than in cell culture (reduced by 50% 
at Muc16 intron 54), likely because of lower
copy number and expression levels in trans- 
duced cells in vivo or in vivo gene silencing 
over time (33, 36, 37). 

Cas-independent RNA off-target adenine 
base editing in vivo is typically indistinguish- 
able from background A-to-I conversion owing 
to the low copy number of ABE-expressing 
transgenes (33, 80). We investigated RNA
off-target editing in Δ7SMA mESCs and
differentiated neural lineages, including motor
neurons, that stably produce ABE8e from
low gene copy numbers similar to those
resulting from AAV9 transduction (Fig. 3I;
supplementary text; and fig. S4, F to H).
Consistent with previous reports (80, 81),
whole-transcriptome sequencing did not reveal
detectable accumulation of RNA A-to-I edits over
background levels of endogenous A-to-I and
A-to-G changes (Fig. 3J and fig. S4G).

Collectively, these in vitro and in vivo analy- 
ses did not reveal off-target edits of anticipated 
clinical or physiological significance in human 
or mouse cells, suggesting high target speci- 
ficity of the D10 base editing approach. Con- 
tinued preclinical assessment and minimization 
of off-target editing is important to ensure the 
safety of a potential base editing therapeutic 
for the treatment of SMA in patients. 


ABE-mediated rescue of SMA pathophysiology
in mice

The physiology of AAV9-ABE-treated A7SMA 
mice was improved compared with that of 
untreated animals (movies S1 and S2). We 
assessed the rescue of motor phenotypes 
by electrophysiological measurements in 
AAV9-ABE-treated A7SMA mice. We mea- 
sured compound muscle action potential 
(CMAP) amplitude and performed motor unit 
number estimation (MUNE) in the gastro- 
cnemius muscle to assess loss of motor neu- 
ron functional integrity, a key feature of SMA 
and preclinical SMA models (82). We com- 
pared outcomes with US Food and Drug 
Administration (FDA)-approved therapeutics 
for SMA including ICV injection of Zolgensma 
and daily intraperitoneal (IP) injection of 
risdiplam (Evrysdi) at doses that were previ- 
ously demonstrated to confer a survival bene- 
fit to these mice (3.3 x 10’° vg/kg Zolgensma 
and 0.1 mg/kg risdiplam; Fig. 4A) (30, 32). 
MUNE was reduced by 50% in untreated 
A7SMA animals compared with heterozygous 
mice at postnatal day (PND) 12, and Zolgensma 
or 0.1 mg/kg risdiplam showed little to no 
improvement (50 and 75% relative to hetero- 
zygotes, respectively; Kruskal-Wallis test, P > 
0.6). In contrast, MUNE in SMA mice treated 
with 3.3 x 10” vg/kg of AAV9-ABE was signif- 
icantly improved compared with untreated 
animals (Kruskal-Wallis test, P < 0.02) and did 
not significantly differ from heterozygous ani- 
mals, with values averaging 91% that of het- 
erozygotes. CMAP amplitudes were also higher 
for AAV9-ABE-treated mice compared with 
risdiplam-treated or untreated A7SMA mice, 
whereas CMAP amplitudes did not signific- 
antly differ between heterozygotes, Zolgensma- 
treated mice, and AAV9-ABE-treated animals 
(Kruskal-Wallis one-way analysis of variance, 
P > 0.2). Thus, neonatal ICV injection of AAV9- 
ABE measurably rescues SMA pathophysiol- 
ogy of spinal motor neurons. 
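
For readers who want to see how a Kruskal-Wallis comparison of this kind is computed, the following is a minimal Python sketch of the omnibus H statistic. It is not the analysis pipeline used in this study. The group values are hypothetical placeholders rather than measured MUNE data, ties receive their average rank without the usual tie correction, and the P value shortcut applies only to exactly three groups (chi-square approximation with 2 degrees of freedom). The function names are illustrative.

```python
import math


def kruskal_wallis_h(*groups):
    """Return the Kruskal-Wallis H statistic (no tie correction)."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        # Values at sorted positions i..j-1 share ranks i+1..j; use their mean.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    # Sum over groups of (rank sum squared) / (group size).
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)


def p_value_three_groups(h):
    """Chi-square survival function for 2 degrees of freedom: exp(-h / 2)."""
    return math.exp(-h / 2)


# Hypothetical values for illustration only (not data from this study).
untreated = [1.0, 2.0, 3.0]
drug_only = [4.0, 5.0, 6.0]
base_edited = [7.0, 8.0, 9.0]

h = kruskal_wallis_h(untreated, drug_only, base_edited)
print(f"H = {h:.2f}, P = {p_value_three_groups(h):.3f}")
```

The omnibus statistic only indicates whether any group differs. Pairwise P values of the kind reported in the text generally require a post hoc comparison step, which this sketch does not attempt.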

Next, we assessed survival of intracerebro- 
ventricularly AAV9-ABE-injected A7SMA mice. 
In SMA type I patients, therapeutic interven- 
tion can meaningfully improve disease out- 
comes if administered in the first several 
months of life (83-86); however, in A7SMA
mice, survival drops precipitously when
animals receive treatment past PND6 (Fig. 4B)
(87). This large difference is due in part to 
the highly accelerated (~150-fold greater) rate 
of maturation of mice compared with humans 
in the first month, early perinatal reduction in 
SMN expression that occurs in mice (88) and 
humans (28), and the rapid early-onset loss of 
motor units, which consist of spinal motor 
neurons and the muscle fibers that they in- 
nervate (82, 89). Restoration of SMN protein 
levels using inducible transgenes demonstrates 
that high levels of SMN are required by PND4 
to PND6 to rescue A7SMA mice, and delays 
of a small number of days are strongly anti- 
correlated with survival (32, 87, 88, 90-92). In 
cells, complete mRNA rescue is not achieved 
until 7 days after D10 transfection (fig. S5A), 
and the time to restore SMN protein levels in 
vivo surpasses the extremely short therapeu- 
tic window in A7SMA mice. 

The accumulation of SMN protein after 
transduction with the dual single-stranded 
AAV9-ABE8e vectors used in this study requires
completion of (i) second-strand synthesis of each 
AAV9-ABE genome (93, 94), (ii) transcription 
and translation of the split-intein ABE protein 
segments, (iii) assembly and trans-splicing of 
the split ABE protein, (iv) RNP assembly and 
base editing of SMN2, (v) transcription of full- 
length C6T-modified endogenous SMN2 pre- 
mRNA driven by its native promoter, and 
(vi) splicing and translation of corrected SMN2 
transcripts. Thus, the timing for SMN protein 
rescue after AAV9-ABE administration is slower 
than fast-acting splice-switching drugs or con- 
stitutive gene complementation from SMN 
cDNA encoded by a self-complementary AAV9- 
SMN vector such as Zolgensma (93-95). We 
recently demonstrated that in vivo base edit- 
ing affects protein levels by ~1 to 3 weeks 
after administration (80). 

Although the weeks-long timeline of base
editing-mediated SMN restoration exceeds the
short therapeutic window of A7SMA mice,
AAV9-ABE increased the life
span of treated animals by ~33% in two colo- 
nies in different institutions (supplementary 
text; Materials and methods; Fig. 4C; and fig. 
S5, B to D). Life span of treated animals im- 
proved from an average of 17 days (median: 
17 days; maximum: 20 days) to 23 days (me- 
dian: 22 days; maximum: 33 days; Mantel-Cox 
test, P < 0.02). As anticipated, the life-span 
extension resulting from AAV9-ABE treat- 
ment is similar to that achieved by scAAV9- 
SMN gene therapy in postsymptomatic (>PND7) 
A7SMA mice (Fig. 4B) (32, 72, 87, 92). Collec- 
tively, these data demonstrate that postnatal 
conversion of SMN2 C6T by AAV9-ABE res- 
cues SMA motor phenotypes in mice, includ- 
ing the number (MUNE) and output (CMAP) 
of functional motor units innervating muscle, 
and that the prolonged process of
AAV9-ABE-mediated SMN restoration results in mostly
postsymptomatic rescue in A7SMA mice that
results in a statistically significant, but limited,
improvement in animal life span.

Fig. 4. AAV9-ABE-mediated rescue of A7SMA mice. (A) (Left) Motor unit
number estimation (MUNE) and (right) compound muscle action potential (CMAP)
amplitude at PND12 in heterozygotes (n = 11), and A7SMA mice treated with
Zolgensma (n = 5), AAV9-ABE (n = 10), risdiplam (n = 8), or uninjected (n = 7).
(B) Kaplan-Meier curve of A7SMA neonates intracerebroventricularly injected with
Zolgensma from Robbins et al. (87) (data extracted using PlotDigitizer). Average
(avg), median (med), and longest (lng) survival in days: untreated (avg: 13; med: 14;
lng: 15), PND2 (avg: 187; med: 204; lng: 214), PND3 (avg: 102; med: 75; lng: 182),
PND4 (avg: 141; med: 167; lng: 211), PND5 (avg: 76; med: 37; lng: 211), PND6
(avg: 73; med: 34; lng: 211), PND7 (avg: 30; med: 28; lng: 70), and PND8 (avg: 18;
med: 18; lng: 22). (C) Kaplan-Meier curve in AAV9-ABE treated (n = 6) and
uninjected (n = 8) A7SMA mice. (D) Neonatal ICV co-injections with AAV9-ABE,
AAV9-GFP, and nusinersen. (E) (Left) The time required for A7SMA mice to right
themselves in the righting reflex assay at PND7. (Right) The hang time of A7SMA
mice in the inverted screen test at PND25. (F) Analysis of voluntary movement by
open field tracking at PND40. (Left) Traveled distance in centimeters. (Right) Velocity
in centimeters per second. (G and H) Body weight in grams and Kaplan-Meier curve
of A7SMA mice. Graph line shading represents (G) standard deviation or (H) 95%
CI. Animals were treated as indicated. Dots represent individual animals. *P < 0.02,
**P < 0.01, ***P < 0.005, ****P < 0.001. Error bars indicate standard deviations.

Up-regulation of SMN protein levels improves
motor function and life expectancy of SMA
patients and animal models if achieved before
the onset of neuromuscular pathology and
symptoms (13, 32, 85-87, 92), yet even high
levels of SMN protein cannot correct
neuromuscular junction defects once SMA has
progressed to an advanced stage, and loss of motor
neurons upon cell death is irreversible. We
therefore sought to extend the effective
therapeutic window for gene editing by transient
early administration of an existing approved
SMA drug to attenuate disease progression,
as has previously been applied to study milder
forms of SMA in mice (72, 96, 97). Given that
SMA patients in a gene editing clinical trial
would likely be receiving an SMA drug,
repeating the base editing treatment in mice
receiving an existing SMA drug would also
inform a potential future clinical application
of this approach.


Combination therapy improves the life span of
ABE-treated SMA mice

Transient SMA drug administration can ame- 
liorate SMA pathology and extend survival of 
A7SMA mice. We hypothesized that attenuat- 
ing disease progression using nusinersen 
could extend the unusually short therapeutic 
window of A7SMA mice and allow AAV9- 
ABE-mediated rescue to begin before exten- 
sive irreversible SMA damage occurs. The 
mechanism of nusinersen (binding to SMN2 
pre-mRNA) is orthogonal to base editing of 
SMN2 genes, and cotransfection of 20 nM 
nusinersen did not affect base editing
outcomes or inclusion of exon 7 in spliced SMN
transcripts after D10 in A7SMA mESCs (fig. S5,
A and E). We assessed whether coadministra- 
tion of nusinersen can improve phenotypic 
rescue from AAV9-ABE treatment. A single 
ICV injection of nusinersen at PND0 has been
shown to extend survival of A7SMA mice by
several weeks (98); thus, we co-injected a single
low dose (1 ug) of nusinersen together with
AAV9-ABE and AAV9-GFP in A7SMA neonates 
(supplementary text). As a control, we also 
treated A7SMA neonates with 1 ug nusinersen 
and AAV9-GFP but no base editor (Fig. 4D). 
We assessed motor coordination and overall 
muscle strength at PND7 using the righting 
reflex test, which measures the time needed 
for a mouse placed on its back to right itself 
(Fig. 4E). We observed a significant differ- 
ence between heterozygotes and nusinersen- 
treated or untreated A7SMA mice (Kruskal-Wallis 
test, P < 0.01) but no significant difference 
between mice treated with a combination of
AAV9-ABE and nusinersen (hereafter
AAV9-ABE+nusinersen) and heterozygous
littermates (Kruskal-Wallis test, P > 0.1).

Next, we assessed motor strength and coor- 
dination of treated and heterozygous mice 
using an inverted screen test, which measures 
how long a mouse can hang inverted from a 
screen mesh surface. At PND25, A7SMA ani- 
mals treated with nusinersen alone performed 
significantly worse at inverted screen testing 
than did healthy heterozygous mice (Kruskal- 
Wallis test, P = 0.007; Fig. 4E). In contrast, the 
AAV9-ABE+nusinersen-treated animals showed 
no significant difference in the inverted screen 
assay from healthy heterozygous mice. Nota- 
bly, half of nusinersen-only-treated animals 
were deceased by this time point, and age- 
matched untreated A7SMA mice do not sur- 
vive long enough for this PND25 assay. 

For a more complete behavioral assessment 
of treated and heterozygous animals, we per- 
formed extensive multiparametric analysis of 
voluntary movement by open field tracking at 
PND40 (Fig. 4F and fig. S5, F to J). Across 33
parameters, including traveled distances, veloc- 
ity, duration, and counts of various activ- 
ities, the measured behaviors of AAV9-ABE+ 
nusinersen-treated animals showed no significant
difference from those of heterozygous
mice (Mann-Whitney test, P > 0.5). Neither
nusinersen-only-treated nor untreated
age-matched A7SMA mice were available as a
reference for this PND40 assay owing to their
short life span. 

We also assessed the effect of AAV9-ABE+ 
nusinersen treatment on weight and life span 
of A7SMA mice. The weight of nusinersen-only- 
and AAV9-ABE+nusinersen-treated A7SMA 
mice steadily increased and was indistin- 
guishable for the first week of life, after which 
weight gain slowed in the nusinersen-only 
cohort (Fig. 4G). Combination-treated animals 
maintained, on average, 61 ± 4.0% of the weight
of heterozygous animals throughout their life 
spans. The nusinersen-only injection improved 
life span of A7SMA mice from an average of 
17 days (median: 17; maximum: 20 days; Fig. 
4C) to an average of 28 days (median: 29; 
maximum: 37 days; Mantel-Cox test, P = 0.0001; 
Fig. 4H). Notably, AAV9-ABE+nusinersen treat- 
ment improved survival of A7SMA mice to an 
average of 111 days (median: 77; Mantel-Cox 
test, P = 0.002), with >60% of animals sur- 
viving beyond nusinersen-only controls, and 
a 10-fold increase in maximum life span (37 
days maximum with nusinersen only com- 
pared with 360 days maximum with AAV9- 
ABE). AAV9-ABE+nusinersen-treated SMA 
mice also exhibited normal behavior and 
vitality well beyond the life span of nusi- 
nersen-only controls (P40, P96, and P200 in 
movies S3 to S5). Collectively, these data in- 
dicate that transient extension of the very 
narrow therapeutic window in A7SMA mice 
can greatly improve phenotypic rescue of 
SMA from base editing of SMN2. 
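
The survival comparisons above rest on Kaplan-Meier estimates. As a rough illustration of how such a curve and its median are derived, the sketch below implements the product-limit estimator in plain Python. The survival times are hypothetical placeholders and not animals from this study, the function names are illustrative, and the Mantel-Cox (log-rank) test used for the reported P values is not reproduced here.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  survival or censoring time for each animal (days)
    events: True if death was observed, False if the animal was censored
    Returns a list of (time, survival probability) at each observed death time.
    """
    deaths = Counter(t for t, observed in zip(times, events) if observed)
    leaving = Counter(times)  # deaths plus censored animals at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Every animal recorded at time t leaves the risk set afterwards.
        at_risk -= leaving[t]
    return curve


def median_survival(curve):
    """First time at which estimated survival falls to 50% or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Hypothetical cohort for illustration only (not data from this study).
days = [14, 15, 17, 17, 18, 20]
died = [True] * len(days)

curve = kaplan_meier(days, died)
for t, s in curve:
    print(f"day {t}: S(t) = {s:.2f}")
print("median survival (days):", median_survival(curve))
```

Censored animals (for example, those still alive at the end of observation) are passed with `events` set to `False`; they reduce the number at risk without lowering the survival estimate.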

Whereas neonatal AAV9-ABE ICV injection 
alone enables life extension in A7SMA mice 
that resembles >PND7 ICV injection with 
Zolgensma (Fig. 4, B and C) (87), coadminis- 
tration of 1 ug nusinersen temporarily slows 
disease progression and broadens the narrow 
therapeutic window, allowing base editing 
the opportunity to enable life-span rescue 
that more closely resembles that of presymp- 
tomatic Zolgensma administration at <PND3 
(Fig. 4H). Moreover, these data demonstrate 
compatibility of AAV9-ABE with nusinersen 
as a one-time treatment without evident ad- 
verse effects and with apparent synergy to 
improve therapeutic outcomes. Such a com- 
bination therapy approach may play an im- 
portant role in future clinical trial designs for 
one-time SMA treatments that permanently 
correct a genetic cause of the disease and for 
clinical application in patients already receiv- 
ing treatment. 


Discussion 


Current treatment options for SMA have revo- 
lutionized care for thousands of patients, ef- 
fectively extending life span, preventing the 
loss of motor function in presymptomatic
patients, and delaying progression in sympto- 
matic patients by increasing full-length SMN 
protein levels (13, 24, 85, 86, 90, 99). However, 
current therapies do not restore endogenous 
protein levels and native regulation of SMN, 
which could result in pathogenic SMN insuf- 
ficiency in motor neurons or potential long- 
term toxicity in other tissues (21, 23-28, 35). 
Furthermore, the transient therapies nusi- 
nersen and risdiplam require repeated dos- 
ing throughout a patient’s lifetime, and it is 
unclear how long Zolgensma gene complemen- 
tation will persist in motor neurons (36, 37). 
Thus, achieving permanent and endogenously 
regulated rescue of SMN protein levels is an 
important goal of a future therapeutic for SMA 
patients. The optimized D10 ABE strategy 
developed in this work is a one-time treat- 
ment that enables permanent and precise 
editing of endogenous SMN2 genes while 
preserving native transcript levels and
regulatory mechanisms that govern SMN
expression (1, 4, 5, 28, 77, 100). As such, a future base
editing therapeutic approach could offer sub- 
stantial benefits over existing SMA therapies. 

We compared a total of 79 nuclease and 
base editing strategies targeting five regions of 
SMN2 to induce either posttranscriptional or 
posttranslational regulatory changes in SMN2 
that up-regulate SMN protein production. BE- 
Hive and inDelphi machine learning models 
enabled the design of precise editing strategies 
that, in some cases, were not obvious and also 
helped preselect sgRNAs for genotypic and 
phenotypic validation of editing outcomes. All 
SMA patients, regardless of their SMN1
mutations, must carry the SMN2 gene to complete
gestation (7), and thus the genome editing 
strategies identified in this study have the 
potential to benefit all SMA patients. 

While on-target Cas nuclease editing at SMN2 
can be precise, double-strand breaks (DSBs) can 
result in large deletions and chromosomal
rearrangements, especially when induced
simultaneously at multiple genomic loci (101). Given
that SMA patients usually have multiple copies 
of SMN2, nuclease editing may result in un- 
intended restructuring of the chromosome re- 
gion (5q13) that harbors SMN genes (102, 103). 
In contrast, base editors precisely convert nu- 
cleotides without inducing DSBs (50, 104, 105) 
and result in greater SMN protein up-regulation 
than the nuclease strategies in this study (up 
to 50-fold by base editors compared with up to 
17-fold by nucleases). We therefore recom- 
mend that future gene editing therapeutic 
strategies for SMA use base editing rather 
than nucleases. 

ABE strategy D10 demonstrated high on- 
target efficiency and specificity, with minimal 
Cas-dependent or Cas-independent off-target 
DNA or RNA editing. It is possible that ex- 
tended base editor expression in cells, as can 
result from AAV-delivery, could result in a
greater accumulation of genomic and tran- 
scriptomic off-target events. Therefore, a deeper 
assessment of genomic and transcriptomic off- 
targets and efforts to minimize off-target edit- 
ing risk will be important in the preclinical 
development of a potential base editing thera- 
peutic for SMA. If needed, Cas-independent 
editing events can be further minimized by 
alternative delivery strategies that shorten 
exposure to base editors (61) and by the use of 
tailored deaminases such as the V106W
variant of TadA*-8e (61, 63) or TadA-8.17-m (106).

SMA has variable presentation in humans 
that largely correlates with the copy number 
of SMN2 (107-111). Type I SMA patients have 
two SMN2 copies and present with symptoms 
within the first 6 months of life, type II 
patients have three copies and present with 
symptoms by 18 months, whereas type III 
patients have 3 or 4 SMN2 copies and later 
onset of symptoms. Early intervention is para- 
mount to achieving the best outcomes for 
SMA patients. The window to effectively treat 
type II and III patients is broader than for 
type I patients, who should ideally receive 
treatment within the first few months of life 
and up to 18 months (13, 24, 83-86, 99). In- 
deed, we directly observed the critical role of 
differences in timing on the order of days in 
determining the efficacy of an AAV9-ABE 
treatment in A7SMA mice, which have an un- 
usually short (<6 days) therapeutic window 
compared with the time scale of base editing 
(weeks) (87). We show that the FDA-approved 
ASO drug nusinersen can extend the very 
short therapeutic window for rescue in A7SMA 
mice, allowing base editing-mediated rescue 
of SMN protein levels to occur to a greater 
extent (80). We anticipate that the broader 
therapeutic window in human SMA patients 
would provide ample opportunity for AAV9- 
ABE-mediated restoration of SMN protein 
levels to take place without the need for co- 
administration of a transient therapeutic. 
Nevertheless, our study demonstrates the 
compatibility of base editing with nusinersen 
as a combination therapy approach to treat 
SMA in animals, which may be valuable for 
future clinical applications. 

The intracerebroventricularly injected AAV9- 
ABE animals in our study exhibited mouse- 
specific peripheral disease phenotypes that are 
common in SMA mouse models, including 
necrosis of the extremities (112), while
exhibiting otherwise normal behavior and vitality
without displays of progressive muscle weak- 
ness. However, SMA treatment that is restricted 
to the CNS also reveals a later-onset (>2 months) 
lethal cardiac abnormality specific to A7SMA 
mice (32, 113-116), which likely underlies the 
sudden late-stage fatality observed in intra- 
cerebroventricularly AAV9-ABE-treated ani- 
mals in this study. Treating both CNS and 
peripheral tissues may ameliorate this murine
cardiac phenotype to improve life span of 
treated A7SMA mice compared with intra- 
cerebroventricularly injected animals (113, 117). 
Nevertheless, given that patients have been 
successfully treated intrathecally with Spin- 
raza, peripheral restoration of SMN protein 
does not appear to be required to rescue SMA 
lethality in humans (23, 31, 83, 90, 118). 

As demonstrated in this work, dual-AAV 
delivery of base editors supports therapeu- 
tic levels of editing in mouse models of human 
disease (119, 120). After these in vivo experi- 
ments were completed, our lab developed 
efficient in vivo base editing using single- 
AAV9-ABE systems that use size-minimized 
AAV vector components and one of a suite of 
small Cas protein domains that are highly 
active as ABEs (80). Such single-AAV base edit- 
ing systems may simplify the development of 
future base editor therapeutics and potentially 
minimize the required dose and potential side 
effects of AAV in clinical settings (121).


Materials and methods 
Cell culture 


Culture of mESCs, HEK293T, and U2OS cells
was performed according to previously pub- 
lished protocols (122). mESCs were maintained 
on 0.2% gelatin-coated plates feeder-free in 
mESC media composed of Knockout Dulbecco’s 
Modified Eagle medium (DMEM; Life Tech- 
nologies) supplemented with 15% defined 
fetal bovine serum (FBS, HyClone), 0.1 mM 
nonessential amino acids (NEAA, Life Tech- 
nologies), Glutamax (GM, Life Technologies), 
0.55 mM 2-mercaptoethanol (β-ME,
Sigma-Aldrich), 1X ESGRO LIF (Millipore), with the
addition of 2i: 5 nM GSK-3 inhibitor XV
(Sigma-Aldrich), and 500 nM U0126 (Sigma-Aldrich).
A7SMA mESCs were a kind gift from L. L. Rubin. 
HEK293T cells were purchased from ATCC 
(CRL-3216) and were maintained in DMEM 
(Life Technologies) supplemented with 10% 
FBS (Thermo Fisher Scientific). U2OS cells 
were purchased from ATCC (HTB-96) and were 
maintained in McCoy’s 5a medium (Life Tech- 
nologies) supplemented with 10% FBS (Thermo 
Fisher Scientific). All cells were regularly tested 
for mycoplasma. 

For genome editing experiments, cells were 
seeded 1 day prior to be ~70 to 80% confluent 
on the day of transfection and transfected with 
sgRNA and genome editing plasmids at a 1:1 
molar ratio using Lipofectamine 3000 (Thermo 
Fisher Scientific) in accordance with the man- 
ufacturer’s protocols. For stable integration of 
plasmids, cells were cotransfected with Tol2 
transposase at an equimolar ratio. Cells that 
did not undergo antibiotic selection were
cultured for 3 to 5 days before harvesting. For
antibiotic selection, A7SMA mESCs were treated
with 50 ug/ml hygromycin B (Life Technologies)
and/or 6.67 ug/ml blasticidin as indicated,
starting 24 hours after transfection.
For transient selection, antibiotics were re- 
moved from the media after 48 hours. Selected 
cells were allowed to recover and expand be- 
fore harvesting. All sgRNA sequences designed 
for this study are listed in the supplementary 
materials. 

For A7SMA mESC nusinersen experiments, 
cells were transfected with 20 nM of fully 
2'-O-methoxyethyl (MOE)-modified ASO (5'- 
TCACTTTCATAATGCTGG-3’) on a phosphor- 
othioate backbone (TriLink), using Lipofectamine 
3000 (Thermo Fisher Scientific). After 24 hours, 
media was replaced every other day with fresh 
mESC+2i media. For splicing rescue by risdi- 
plam, mESC media was supplemented with 
0.1 to 1 uM of risdiplam (RG7916, Selleck
Chemicals LLC) in dimethyl sulfoxide, as indicated.
Cells were harvested at the indicated time points. 


High-throughput sequencing of genomic DNA 


Sequencing library preparation was performed 
according to previously published protocols 
(50). Primers are listed in the supplementary 
materials. Briefly, we isolated genomic DNA 
(gDNA) with the QIAamp DNA mini kit
(Qiagen) and used 250 to 1000 ng of gDNA for 
individual locus editing experiments and 
20 ug of gDNA for comprehensive context li- 
brary samples. Sequencing libraries were ampli- 
fied in two steps, first to amplify the locus of 
interest and second to add full-length Illumina 
sequencing adapters using the NEBNext Index 
Primer Sets 1 and 2 [New England Biolabs 
(NEB)] or internally ordered primers with 
equivalent sequences. All PCRs were performed 
using NEBNext Ultra II Q5 Master Mix.
Samples were pooled using Tape Station (Agilent)
and quantified using a KAPA Library Quanti- 
fication Kit (KAPA Biosystems). The pooled 
samples were sequenced using Illumina Next- 
Seq or MiSeq. Alignment of fastq files and 
quantification of editing frequency for indi- 
vidual loci were performed using CRISPResso2 
in batch mode (66). The editing frequency for 
each site was calculated as the ratio between 
the number of modified reads (i.e., containing 
nucleotide conversions or indels) and the total 
number of reads. Base editing characterization 
library analysis was performed as previously 
described (50). 
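
The editing frequency defined above is a simple ratio of modified reads to total reads. A minimal Python sketch of that arithmetic follows. The read counts and site names are hypothetical placeholders, the helper name is illustrative, and the sketch does not reproduce CRISPResso2's alignment or read classification.

```python
def editing_frequency(modified_reads: int, total_reads: int) -> float:
    """Fraction of reads that contain a nucleotide conversion or indel."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= modified_reads <= total_reads:
        raise ValueError("modified_reads must lie between 0 and total_reads")
    return modified_reads / total_reads


# Hypothetical read counts for illustration only (not data from this study).
sites = {
    "on_target": (8_700, 10_000),
    "off_target_example": (50, 10_000),
}
for name, (modified, total) in sites.items():
    print(f"{name}: {100 * editing_frequency(modified, total):.1f}% edited")
```

Because the denominator is the total read count at each site, frequencies from shallowly sequenced sites are noisier than those from deeply sequenced ones, which is worth keeping in mind when comparing low off-target percentages.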


Quantification of SMN splice products 


We isolated mRNA from A7 mESCs with the 
RNeasy mini kit (Qiagen) and performed re- 
verse transcription using SuperScript IV 
(Thermo Fisher) according to the manufac- 
turer’s protocols. For targeted SMN2 splice 
product quantitation by qPCR, high-throughput 
sequencing (HTS), or automated electropho- 
resis, we performed reverse transcription with 
random hexamers. Inclusion of SMN2 exon 7 
was quantified by automated electrophoresis 
using Tape Station (Agilent). For unbiased 
SMN2 splice product analysis by high-throughput
sequencing, we performed reverse transcrip- 
tion using a custom oligo-dT primer with a 
Read 2 Illumina sequencing stub. The pooled 
samples were sequenced using Illumina MiSeq. 
All PCRs were performed using NEBNext Ultra 
II Q5 Master Mix, with the addition of Sybr 
Green for qPCR. Primers are listed in table S3. 


Western blot 


Cells harvested for Western blot were washed 
with ice-cold phosphate-buffered saline (PBS) 
and incubated at 4°C for 30 min while rock- 
ing in RIPA lysis buffer (Thermo Fisher) sup- 
plemented with 1 mM phenylmethylsulfonyl 
fluoride (Thermo Fisher) and cOmplete EDTA- 
free protease inhibitor cocktail (Roche). Lysates 
were clarified by centrifugation at 12,000 rpm 
at 4°C for 20 min. Lysates were normalized 
using bicinchoninic acid (BCA; Pierce BCA 
Protein Assay Kit) and combined with 4x 
Laemmli buffer (BioRad) and dithiothreitol
(Thermo Fisher) at a final concentration of 
1 mM. We loaded 10 ug of reduced protein 
per gel lane and performed transfer with an 
iBlot 2 dry blotting system (Thermo Fisher) 
using the following program: 20 V for 1 min, 
then 23 V for 4 min, then 25 V for 2 min, for 
a total transfer time of 7 min. Blocking was 
performed at room temperature for 60 min 
with block buffer: 1% bovine serum albumin 
(BSA) in TBST (150 mM NaCl, 0.5% Tween-20, 
50 mM Tris-Cl, pH 7.5). Membranes were then 
incubated in primary antibody diluted in 
block buffer for 2 hours at room temperature. 
After washing, secondary antibodies diluted
in TBST were added and incubated for 1 hour 
at room temperature. Membranes were washed 
again and imaged using a LI-COR Odyssey. 
Wash steps were 3x 5-min washes in TBST. 
Primary antibodies used were mouse anti- 
human SMN (Proteintech 2C6D9), mouse anti- 
mouse and human SMN (Proteintech 3A8G1), 
and rabbit anti-histone H3 (Cell Signaling D1H2); 
secondary antibodies used were LI-COR IRDye 
680RD goat anti-rabbit (#926-68071) and goat
anti-mouse (#926-68070). 


Base editor characterization library assay 


For characterization of the ABE8e-SpCas9 
base editor, we used mouse ESCs carrying the 
comprehensive context library according to 
previously published protocols (42, 50). Brief- 
ly, 15-cm plates with >10’ initial cells were 
transfected with a total of 50 ug of p2T-ABE8e- 
SpCas9 and 30 ug of Tol2 plasmid to allow 
for stable genomic integration with Lipofect- 
amine 3000 according to manufacturer pro- 
tocols and selected with 10 ug/ml blasticidin 
starting the day after transfection for 4 days 
before harvesting. We maintained an average 
coverage of ~300x per library cassette through- 
out. We collected gDNA from cells 5 days after 
transfection, after 4 days of antibiotic selection. 


Cloning


Base editor plasmids were constructed by re- 
placing deaminase and Cas-protein domains of 
the p2T-CMV-ABE7.10-BlastR (Addgene 152989) 
plasmid by USER cloning (NEB) (50). Individual 
sgRNAs were cloned into the SpCas9-hairpin 
U6 sgRNA expression plasmid (Addgene 71485) 
using BbsI plasmid digest and Gibson as- 
sembly (NEB). Protospacer sequences and 
gene-specific primers used for amplification 
followed by HTS are listed in table S1. Con- 
structs were transformed into Mach1 chem- 
ically competent Escherichia coli (Thermo 
Fisher) grown on LB agar plates, and liquid 
cultures were grown in LB broth overnight at 
37°C with 100 ug/ml ampicillin. Individual 
colonies were validated by TempliPhi rolling 
circle amplification (Thermo Fisher) followed 
by Sanger sequencing. Verified plasmids were 
prepared by mini, midi, or maxiprep (Qiagen). 
AAV vectors were cloned by Gibson assem- 
bly (NEB) using NEB Stable Competent E. coli 
(High Efficiency) to insert the sgRNA sequence
and C-terminal base editor half of ABE8e- 
SpyMac into v5 Cbh-AAV-ABE-NpuC+U6- 
sgRNA (Addgene 137177), and the N-terminal 
base editor half and a second U6-sgRNA cas- 
sette into v5 Cbh-AAV-ABE-NpuN (Addgene 
137178) (73). 


Neural differentiation 


Differentiation of A7SMA mESCs was
performed according to established protocols
(123, 124). Briefly, A7SMA mESCs maintained
on 0.2% gelatin-coated plates feeder-free in 
mESC media + 2i were plated onto irradiated 
mouse embryonic fibroblast (MEF) feeders on 
0.2% gelatin-coated plates in mESC media for 
7 days to wean cells from 2i factors. Cells were 
then seeded at 10° in 10-cm tissue culture- 
treated plates for 48 hours for priming and 
depletion of feeders. Media was replaced with 
neural differentiation (ND) media composed 
of 1:1 DMEM:F12 and Neurobasal media (Life 
Technologies) supplemented with 10% knock- 
out serum-replacement (KOSR, Life Technol- 
ogies), Glutamax (GM, Life Technologies), and 
0.55 mM 2-mercaptoethanol (β-ME,
Sigma-Aldrich) for one hour before trypsinization
and seeding of 2 x 10° cells in 10-cm non- 
tissue culture-treated dishes for 24 hours. 
Single cells and small early embryoid bodies 
(EBs) in suspension were collected and trans- 
ferred to 10-cm tissue culture-treated plates in 
fresh ND media for 24 hours. Small EBs that 
remained in suspension were collected and 
transferred to 10-cm tissue culture-treated 
plates in fresh ND media with the addition of 
1 uM retinoic acid (RA; Sigma-Aldrich R2625) 
for caudal neural differentiation (CND), or 
with 1 uM RA and 0.5 uM smoothened agonist 
(SmAg; Calbiochem 566660) for motor neuron 
differentiation (MND) for 72 hours. Large EBs 
were collected and split into two 10-cm tissue 
culture-treated plates in neural growth (NG)
media composed of 1:1 DMEM:F12 and Neuro- 
basal media supplemented with GM, B27 (Life 
Technologies), and 10 ng/ml human recombi- 
nant glial cell line-derived neurotrophic factor 
(GDNF; R&D Systems 212-GD-010) for 48 hours. 
EBs were monitored for Mnx1:GFP expression 
to assess motor neuron differentiation effi- 
ciency and imaged using a Zeiss inverted fluo- 
rescence microscope or collected for downstream 
whole-transcriptome analysis. 


Whole-transcriptome RNA sequencing 


Library preparation, sequencing and analysis 
were performed by SMART-seq2 as previously 
described (125). Briefly, total RNA was har- 
vested from cells using the RNeasy Mini kit 
(Qiagen). First, we incubated 20 ng purified to- 
tal RNA with RNase inhibitor (Clontech Takara 
2313B), deoxynucleoside triphosphate (dNTP) 
mix (Thermo Fisher R0192), and the 3'-RT primer
[5'-AAGCAGTGGTATCAACGCAGAGTAC(T30)VN-3']
at 72°C for 3 min to anneal the RT primer.
Next, we performed first-strand synthesis
using the template switching oligo (TSO;
5'-AGCAGTGGTATCAACGCAGAGTACrGrG+G-3';
Exiqon, Qiagen) together with RNase
inhibitor, betaine (Sigma Aldrich B0300-1VL),
MgCl2 (Sigma Aldrich 1028), and Maxima
RNase H-minus RT (Thermo Fisher EP0751),
according to the manufacturer’s protocols. We
performed preamplification of first-strand
libraries with the ISPCR primer
(5'-AAGCAGTGGTATCAACGCAGAGT-3') using KAPA HiFi
HotStart (KAPA KK2601) and SYBR green
(Thermo Fisher). Whole-transcriptome
amplification (WTA) product was washed using
DNA SPRI beads (Beckman Coulter A63881)
and quantified by Agilent Tapestation. We
performed tagmentation and library preparation
of 0.25 ng WTA using the Nextera XT kit
(Illumina) and Nextera i7 and Nextera i5
barcoding primers. Samples were pooled and
washed using DNA SPRI beads and quantified 
by Agilent Tapestation and the KAPA Univer- 
sal Library Quantification kit (Roche KK4824). 
Libraries were run on Illumina NextSeq 550. 
FASTQs were generated using bcl2fastq 
v2.20 and processed with Trim Galore v0.6.7 in 
paired-end mode with default parameters to 
remove low-quality bases, adapter sequences, 
and unpaired sequences. Trimmed reads were 
aligned to the GENCODE mouse reference ge- 
nome M31 (GRCm39) using STAR (v2.7.10a), 
quantified using kallisto (126), and refined to 
canonical coding sequences using CCDS
release 21 (127). For RNA A-to-I off-target
analysis, REDItools v1.3 was used to quantify
the average frequency of A-to-I editing among
all sequenced adenosines in each sample (128),
excluding adenosines with read depth <10 or
a read quality score <30. The transcriptome-wide
A-to-I editing frequency was calculated
independently for each biological replicate as:
(number of reads in which an adenosine was
called as a guanosine)/(total number of reads
covering all analyzed adenosines).
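
The frequency defined above can be illustrated with a minimal Python sketch. It is not the REDItools implementation: the `AdenosineSite` record, its field names, and the example counts are hypothetical, and a single per-site quality value is assumed for simplicity. Only the depth and quality cutoffs (10 and 30) and the pooled ratio come from the text.

```python
from dataclasses import dataclass


@dataclass
class AdenosineSite:
    """Read counts at one reference adenosine (hypothetical record layout)."""
    depth: int       # total reads covering the site
    g_reads: int     # reads in which the adenosine was called as guanosine
    quality: float   # read quality score summarized for the site (assumption)


def a_to_i_frequency(sites, min_depth=10, min_quality=30):
    """Pooled frequency: G-called reads / all reads over the retained adenosines."""
    # Exclude adenosines with read depth <10 or read quality score <30.
    kept = [s for s in sites if s.depth >= min_depth and s.quality >= min_quality]
    total_reads = sum(s.depth for s in kept)
    if total_reads == 0:
        raise ValueError("no adenosines passed the depth and quality filters")
    return sum(s.g_reads for s in kept) / total_reads


# Made-up counts for one biological replicate.
sites = [
    AdenosineSite(depth=100, g_reads=2, quality=35.0),
    AdenosineSite(depth=50, g_reads=1, quality=38.0),
    AdenosineSite(depth=5, g_reads=5, quality=40.0),    # dropped: depth < 10
    AdenosineSite(depth=80, g_reads=40, quality=20.0),  # dropped: quality < 30
]
frequency = a_to_i_frequency(sites)  # 3 G-called reads over 150 retained reads
```

Because the ratio pools reads across sites rather than averaging per-site rates, deeply covered adenosines carry more weight, which matches the definition given in the text.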


Purification of SpyMac Cas nuclease protein 


SpyMac Cas nuclease protein was cloned into 
the expression plasmid pD881-SR (Atum, Cat. 
No. FPB-27E-269). The resulting plasmid was 
transformed into BL21 Star DE3 competent 
cells (Thermo Fisher, Cat. No. C601003).
Colonies were picked for overnight growth in
terrific broth (TB) and 25 µg/ml kanamycin at
37°C. The next day, 2 liters of prewarmed TB
were inoculated with overnight culture at a
starting OD600 of 0.05. Cells were shaken at
37°C for ~2.5 hours until the OD600 was ~1.5.
Cultures were cold-shocked in an ice-water 
slurry for 1 hour, after which L-rhamnose was 
added to a final concentration of 0.8% to in- 
duce protein production. Cultures were then 
incubated at 18°C with shaking for 24 hours 
to produce protein. After induction, cells were 
pelleted and flash-frozen in liquid nitrogen 
and stored at -80°C. The next day, cells were
resuspended in 30 ml cold lysis buffer [1 M
NaCl, 100 mM Tris-HCl pH 7.0, 5 mM
tris(2-carboxyethyl)phosphine (TCEP), 20% glycerol,
with five tablets of cOmplete, EDTA-free
protease inhibitor cocktail (Millipore Sigma, Cat.
No. 4693132001)]. Cells were passed three times
through a homogenizer (Avestin Emulsiflex-C3) 
at ~18,000 psi to lyse. Cell debris was pelleted for 
20 min using a 20,000g centrifugation at 4°C. 
Supernatant was collected and spiked with 
40 mM imidazole, followed by a 1-hour incu- 
bation at 4°C with 1 ml of Ni-NTA resin slurry 
(G Bioscience Cat. No. 786-940, prewashed once 
with lysis buffer). Protein-bound resin was 
washed twice with 12 ml of lysis buffer in a 
gravity column at 4°C. Protein was eluted in 3 ml 
of elution buffer (800 mM imidazole, 500 mM 
NaCl, 100 mM Tris-HCl pH 7.0, 5 mM TCEP, 10% 
glycerol). Eluted protein was diluted in 40 ml 
of low-salt buffer (100 mM Tris-HCl, pH 7.0, 
1 mM TCEP, 20% glycerol) just before loading
into a 50 ml Akta Superloop for ion exchange 
purification on the Akta Pure25 FPLC. Ion 
exchange chromatography was conducted on 
a 5 ml GE Healthcare HiTrap SP HP pre-packed
column (Cat. No. 17115201). After washing 
the column with low-salt buffer, the diluted 
protein was flowed through the column to 
bind. The column was then washed in 15 ml 
of low-salt buffer before being subjected to 
an increasing gradient to a maximum of 80% 
high-salt buffer (1 M NaCl, 100 mM Tris-HCl, 
pH 7.0, 5 mM TCEP, 20% glycerol) over the 
course of 50 ml, at a flow rate of 5 ml per 
minute. 1-ml fractions were collected during 
this ramp to high-salt buffer. Peaks were as- 
sessed by SDS-polyacrylamide gel electropho- 
resis to identify fractions containing the desired 
protein, which were concentrated first using 
an Amicon Ultra 15-ml centrifugal filter (100-kDa
cutoff, Cat. No. UFC910024), followed by a
0.5-ml 100-kDa cutoff Pierce concentrator (Cat.
No. 88503). Concentrated protein was quantified
using a BCA assay (Thermo Fisher, Cat. No.
23227) and determined to be 12.6 mg/ml.
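
The buffer arithmetic implied by the ion exchange steps above can be sketched in Python. Several points here are assumptions, not statements from the protocol: the 3 ml eluate is taken to be added to 40 ml of low-salt buffer (43 ml final volume), the low-salt buffer is taken to contain no NaCl, and the gradient is taken to be linear from 0% to 80% high-salt buffer. The function names are illustrative.

```python
def diluted_concentration(stock_mM, stock_ml, diluent_ml, diluent_mM=0.0):
    """Concentration after mixing a stock volume with a diluent volume."""
    total_ml = stock_ml + diluent_ml
    return (stock_mM * stock_ml + diluent_mM * diluent_ml) / total_ml


def gradient_nacl_mM(volume_ml, total_ml=50.0, max_fraction_high=0.80,
                     low_mM=0.0, high_mM=1000.0):
    """NaCl concentration at a given elution volume of an assumed linear gradient."""
    if not 0.0 <= volume_ml <= total_ml:
        raise ValueError("volume lies outside the gradient")
    fraction_high = max_fraction_high * volume_ml / total_ml
    return low_mM + fraction_high * (high_mM - low_mM)


# Eluate (500 mM NaCl, 800 mM imidazole) diluted before loading.
nacl_at_load = diluted_concentration(500.0, 3.0, 40.0)       # about 35 mM
imidazole_at_load = diluted_concentration(800.0, 3.0, 40.0)  # about 56 mM

# A 50 ml gradient at 5 ml per minute, collected in 1-ml fractions.
gradient_minutes = 50.0 / 5.0
fraction_count = int(50.0 / 1.0)
nacl_mid = gradient_nacl_mM(25.0)  # halfway through the ramp
nacl_end = gradient_nacl_mM(50.0)  # end of the ramp (80% high-salt buffer)
```

Under these assumptions the protein is loaded at roughly 35 mM NaCl, and each 1-ml fraction corresponds to a 16 mM step in NaCl concentration along the ramp.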


CIRCLE-seq off-target editing analysis 


Off-target analysis using CIRCLE-seq was per- 
formed as previously described (67, 129). Brief- 
ly, genomic DNA from HEK293T cells or 
NIH3T3 cells was isolated using Gentra Pure- 
gene Kit (Qiagen) according to manufacturer’s 
instructions. Purified genomic DNA was sheared 
with a Covaris S2 instrument to an average 
length of 300 base pairs (bp). The fragmented 
DNA was end-repaired, poly-A tailed, and
ligated to a uracil-containing stem-loop adaptor
using the KAPA HTP Library Preparation Kit, 
PCR Free (KAPA Biosystems). Adaptor-ligated 
DNA was treated with Lambda Exonuclease 
(NEB) and E. coli Exonuclease I (NEB), then
with USER enzyme (NEB) and T4 polynucleo- 
tide kinase (NEB). Intramolecular circulariza- 
tion of the DNA was performed with T4 DNA 
ligase (NEB) and residual linear DNA was 
degraded by Plasmid-Safe ATP-dependent 
DNase (Lucigen). In vitro cleavage reactions 
were performed with 250 ng of Plasmid-Safe 
ATP-dependent DNase-treated circularized 
DNA, 90 nM of SpyMac Cas9 nuclease protein, 
Cas9 nuclease buffer (NEB), and 90 nM of syn- 
thetic chemically modified sgRNA (Synthego), 
in a total volume of 100 µl. Cleaved products
were poly-A tailed, ligated with a hairpin adap- 
tor (NEB), treated with USER enzyme (NEB), 
and amplified by PCR with barcoded univer- 
sal primers NEBNext Multiplex Oligos for 
Illumina (NEB), using Kapa HiFi Polymerase 
(KAPA Biosystems). Libraries were sequenced 
with 150-bp paired-end reads on an Illumina 
MiSeq instrument. CIRCLE-seq data analyses 
were performed using open-source CIRCLE-seq 
analysis software and default recommended 
parameters (https://github.com/tsailabSJ/circleseq).


Husbandry of Δ7SMA mice


All experiments in animals were approved by
the Institutional Animal Care and Use
Committees of the Broad Institute of MIT and
Harvard and Ohio State University (OSU).
Δ7SMA heterozygous mice (Smn+/−; SMN2+/+;
SMNΔ7+/+) were purchased from the Jackson
Laboratory (005025) (54) and maintained in
the Broad Institute and OSU vivaria according
to recommendations in the Guide for the Care
and Use of Laboratory Animals of the National
Institutes of Health. Pairs of Δ7SMA
heterozygotes were crossed to generate Δ7SMA mice
(Smn−/−; SMN2+/+; SMNΔ7+/+). On date of birth
(PND0), pups were microtattooed on the foot
pads (Aramis) with animal-grade permanent
ink (Ketchum) using a sterile hypodermic
needle (BD) to enable identification of
individual pups. Subsequently, biopsies of ~1 mm
tissue were taken from the tail using a sterile 
blade, lysed for genomic DNA extraction, and 
used for genotyping by PCR. Litter size was 
controlled to five pups, including 1 to 3 ho- 
mozygous mutants, by culling and cross- 
fostering among same-age mice. Mice of both 
sexes were included in the study, although sex 
has been reported to not have a substantial 
impact on the phenotype of SMA mice (Treat- 
NMD SOP Code: SMA_M.2.2.003). 
Electrophysiology experiments were per- 
formed at OSU. All other animal studies were 
performed at the Broad Institute unless
indicated otherwise in the text. At the Broad
Institute, the mean birthweight of heterozygous
animals was 1.7 ± 0.1 g, and 1.5 ± 0.1 g for SMA
pups, and any animal weighing <1.5 g at time
of birth was excluded from the study. The
average weight of SMA neonates at injection
on PND0 at the Broad Institute was 1.6 ± 0.2 g.
At OSU, the mean birthweight of heterozygous
animals on the day of birth was 1.3 ± 0.1 g and
1.2 ± 0.1 g for SMA pups, and any SMA,
heterozygous, or wild-type pup weighing <1.0 g
at time of birth was excluded from the study.
The average weight of SMA neonates at
injection on PND0 at OSU was 1.3 ± 0.13 g, and
animals were injected with 3.3 x 10” vg/kg of the
dual AAV-ABE vectors. By facility, each litter was 
subjected to the same exclusion criterion (Treat- 
NMD SOP Code: SMA_M.2.2.003). Cohort sizes 
were chosen on the basis of prior experience 
with these animals, known to allow for deter- 
mination of statistical significance. Animals were 
monitored daily for morbidity and mortality 
and weighed every other day from day of birth. 


Intracerebroventricular injections 


Neonatal ICV injections were performed as 
previously described (73, 130). Briefly, glass 
capillaries (Drummond 5-000-1001-X10) were 
pulled to a tip diameter of ~100 µm. High-titer
qualified AAV was obtained through the Viral 
Vector Core at UMass Medical School and con- 
centrated using Amicon Ultra-15 centrifugal 
filter units (Millipore), quantified by qPCR 
(AAVpro Titration Kit v.2, Clontech), and 
stored at 4°C until use. For injection, a small 
amount of Fast Green was added to the AAV 
injection solution to assess ventricle targeting. 
The injection solution was loaded through 
front-filling using the included Drummond 
plungers. Δ7SMA pups were anesthetized by
placement on ice for 2 to 3 min, until they
were immobile and unresponsive to a toe
pinch. Up to 4.5 µl of injection mix was
injected freehand into each ventricle on PND0
and PND1.


Immunofluorescence imaging of spinal 
cord sections 


For immunofluorescence staining of
transduced spinal motor neurons, Δ7SMA mice
were perfused at 25 weeks with ice-cold PBS
and ice-cold 4% paraformaldehyde (PFA), the 
CNS was exposed, and the whole carcass was 
fixed overnight in 4% PFA. Whole spinal cord 
was isolated and fixed in 4% PFA overnight, 
then consecutively transferred to 10%, 20%, 
and 30% sucrose in three overnight incuba- 
tions before embedding in OCT for long-term 
storage at -80°C. Embedded tissue was cryo- 
sectioned and stained with goat anti-ChAT 
(Millipore AB144P), mouse anti-NeuN (EMD 
Millipore MAB377), mouse anti-GFAP
(Sigma-Aldrich MAB3402), rabbit anti-GFP (Thermo
Scientific A-11122), and Alexa-Fluor secondary
antibodies (Life Technologies) and imaged on 
an SP8 confocal microscope (Leica). 


Nuclear isolation and sorting of tissues 


Tissue harvest and nuclear isolation were
performed as previously described (73). Briefly,
deceased Δ7SMA mice were stored at -80°C
until dissection of the brain and spinal cord tis- 
sue. The cortex and cerebella were separated 
from the brain postmortem using surgical 
scissors. Hemispheres were separated using a 
scalpel, and the cortex was separated from un- 
derlying midbrain tissue with a curved spatula. 
For nuclear isolation, dissected tissue was ho- 
mogenized using a glass dounce homogenizer 
(Sigma D8938) (20 strokes with pestle A fol- 
lowed by 20 strokes with pestle B) in 2 ml ice- 
cold EZ-PREP buffer (Sigma NUC-101). Samples 
were incubated for 5 min with an additional 
2 ml EZ-PREP buffer. Nuclei were centrifuged 
at 500g for 5 min, and the supernatant re- 
moved. For spinal cord tissue, wash steps were 
repeated 10 times. Samples were resuspended 
with gentle pipetting in 4 ml ice-cold nuclei
suspension buffer (NSB) consisting of 100 µg/ml
BSA and 3.33 µM Vybrant DyeCycle Ruby
(Thermo Fisher) in PBS and centrifuged at
500g for 5 min. The supernatant was removed,
and nuclei were resuspended in 1 to 2 ml NSB,
passed through a 35-µm strainer, and sorted
into 200 µl Agencourt DNAdvance lysis buffer
using a MoFlo Astrios (Beckman Coulter) at the
Broad Institute flow cytometry core. All steps
were performed on ice or at 4°C. Genomic DNA
was purified according to the DNAdvance
(Agencourt) instructions for 200 µl volume.


Behavioral assays 


Righting reflex was recorded on PND7 by 
placing neonates on their backs and record- 
ing with a stopwatch, up to a maximum of 
30 s, the duration of time that it took for the 
mice to right themselves. For inverted screen 
testing, we subjected juvenile mice to the hori- 
zontal grid test for mice (Maze Engineers) on 
PND25 by placing the animals on a wire-mesh 
screen, which the mice are capable of gripping, 
then inverting the screen over the course of 
2 s, animal head first, over a padded surface
made of 4- to 5-cm-high bedding. The time
it took for the animal to fall onto the bedding
was recorded with a stopwatch. Each mouse 
was assessed with three measurements. The 
procedure concluded when the animal fell 
onto the bedding, or if the animal exceeded 
120 s for the measurement, in which case the 
screen was reverted so that the mouse was 
upright, and the mouse was manually removed 
from the screen. 

Voluntary movement of adult mice was 
recorded on PND40 by open field testing 
(Omnitech Electronics). Mice were brought 
into the testing room under normal lighting 
conditions and allowed 30 to 60 min of ac- 
climation. The animals were placed into the 
locomotor activity chamber with infrared 
beams crossing the x, y, and z axes that
plotted their ambulatory and fine motor
movements and rearing behavior. Recordings were
analyzed using Fusion 5.1 SuperFlex software.


Electrophysiological measurements 


Compound muscle action potential (CMAP) 
and motor unit number estimate (MUNE)
measurements were performed as previously
described (131). Briefly, at PND12, the right sciatic
nerve was stimulated with a pair of insulated 
28-gauge monopolar needles (Teca, Oxford 
Instruments Medical, NY) placed in proxim- 
ity to the sciatic nerve in the proximal hind 
limb. Recording electrodes consisted of a pair 
of fine ring wire electrodes (Alpine Biomed, 
Skovlunde, Denmark). The active recording 
electrode (E1) was placed distal to the knee 
joint over the proximal portion of the triceps 
surae muscle, and the reference electrode (E2) 
was placed over the metatarsal region of the 
foot. A disposable strip electrode (Carefusion, 
Middleton, WI) was placed on the tail to serve 
as the ground electrode. For CMAP, supramax- 
imal responses were generated maintaining 
stimulus currents <10 mA, and baseline-to-peak 
amplitude measurements were made. 

For MUNE, an incremental stimulus
technique similar to a previously described
procedure was used (131). Submaximal stimulation
was used to obtain 10 incremental responses 
to calculate the average single motor unit 
potential (SMUP) amplitude. The first incre- 
ment was obtained by delivering square wave 
stimulations at 1 Hz at an intensity between 
0.21 and 0.70 mA to obtain the minimal all-or- 
none SMUP response. If the initial response 
did not occur with stimulus intensity between 
0.21 and 0.70 mA, the stimulating cathode 
position was adjusted either closer to or far- 
ther away from the position of the sciatic 
nerve in the proximal thigh to decrease or 
increase the required stimulus intensity, re- 
spectively. This first incremental response was 
accepted if three duplicate responses were 
observed. To obtain the subsequent incre- 
mental responses, the stimulation intensity 
was adjusted in 0.03-mA steps, and
incremental responses were distinguished visually in
real time to obtain nine additional
increments. To be accepted, each increment was
required to be: (i) observed for a total of three
duplicate responses, (ii) visually distinct from
the prior increment, and (iii) at least 25 µV
larger than the prior increment. The peak-
to-peak amplitude of each individual incre- 
mental response was calculated by subtracting 
the amplitude of the prior response. The 10 
incremental values were averaged to estimate 
average peak-to-peak SMUP amplitude. The 
maximum CMAP amplitude (peak-to-peak) 
was divided by the average SMUP amplitude 
to yield the MUNE. 
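
The MUNE arithmetic described above can be sketched in Python. This is an illustration under stated assumptions rather than the acquisition software used in the study: responses are supplied as cumulative peak-to-peak amplitudes in µV, the baseline before the first increment is taken as 0 µV, the 25 µV criterion is applied to every increment including the first, and all values are made up. The duplicate-response and visual-distinctness criteria are not modeled.

```python
def incremental_amplitudes(cumulative_uV, min_step_uV=25.0):
    """Subtract each prior response to obtain the individual increments."""
    steps = []
    prior = 0.0  # assumed baseline before the first all-or-none response
    for amplitude in cumulative_uV:
        step = amplitude - prior
        if step < min_step_uV:
            raise ValueError(f"increment of {step} uV is below {min_step_uV} uV")
        steps.append(step)
        prior = amplitude
    return steps


def estimate_mune(cmap_uV, cumulative_uV):
    """Maximum CMAP amplitude divided by the average SMUP amplitude."""
    steps = incremental_amplitudes(cumulative_uV)
    if not steps:
        raise ValueError("at least one incremental response is required")
    mean_smup_uV = sum(steps) / len(steps)
    return cmap_uV / mean_smup_uV


# Ten hypothetical incremental responses (cumulative amplitudes, in uV).
responses_uV = [40.0, 80.0, 130.0, 170.0, 220.0,
                260.0, 300.0, 350.0, 390.0, 440.0]
steps_uV = incremental_amplitudes(responses_uV)
mune = estimate_mune(8800.0, responses_uV)  # hypothetical 8.8 mV maximum CMAP
```

With a zero baseline, the average of the increments equals the final cumulative amplitude divided by the number of increments (here 440/10 = 44 µV), so the estimate depends on how reliably the tenth response is resolved.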


Statistical analysis 


Welch’s two-tailed t tests were used to
compare sequencing, splicing, mRNA levels, and
immunostaining data. Error bars represent
standard deviations of ≥3 independent
biological replicates. Root mean squared error
(RMSE) and Pearson’s r correlation were used
for correlation analysis of predicted and ob- 
served genome editing outcomes, where ap- 
propriate. Kruskal-Wallis tests were used to 
compare physiology measurements and be- 
haviors of mouse cohorts under experimen- 
tal conditions. Mann-Whitney tests were used 
to compare multiparametric measurements of 
voluntary behaviors of mouse cohorts. The 
logrank Mantel-Cox test was used to compare 
body weight and life span of mouse cohorts. All 
statistical tests were calculated by GraphPad 
Prism 9.4.1 and Microsoft Excel v16.64. 
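
For readers unfamiliar with Welch’s test, the statistic and its degrees of freedom can be computed as in the following Python sketch. It uses only the standard library and hypothetical replicate values; it does not reproduce GraphPad Prism output, and the two-tailed P value (which requires the t distribution) is deliberately left out.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two replicates")
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Hypothetical measurements for two groups of biological replicates.
group_a = [10.0, 12.0, 14.0]
group_b = [20.0, 21.0, 22.0, 23.0, 24.0]
t_stat, dof = welch_t(group_a, group_b)
```

Unlike Student’s t test, this form does not pool the two variances, so it remains appropriate when groups differ in spread or replicate number.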
SLEEP


Brain activity of diving seals reveals short sleep cycles at depth


Jessica M. Kendall-Bar, Terrie M. Williams, Ritika Mukherji, Daniel A. Lozano, Julie K. Pitman,
Rachel R. Holser, Theresa Keates, Roxanne S. Beltran, Patrick W. Robinson, Daniel E. Crocker,
Taiki Adachi, Oleg I. Lyamin, Alexei L. Vyssotski, Daniel P. Costa


Sleep is a crucial part of the daily activity patterns of mammals. However, in marine species that spend 
months or entire lifetimes at sea, the location, timing, and duration of sleep may be constrained. To 
understand how marine mammals satisfy their daily sleep requirements while at sea, we monitored 
electroencephalographic activity in wild northern elephant seals (Mirounga angustirostris) diving in Monterey 
Bay, California. Brain-wave patterns showed that seals took short (less than 20 minutes) naps while
diving (maximum depth 377 meters; 104 sleeping dives). Linking these patterns to accelerometry and the
time-depth profiles of 334 free-ranging seals (514,406 sleeping dives) revealed a North Pacific sleepscape in 
which seals averaged only 2 hours of sleep per day for 7 months, rivaling the record for the least sleep 
among all mammals, which is currently held by the African elephant (about 2 hours per day). 


Across the animal kingdom, sleep is critical
for energy conservation, immune function,
memory, and learning (1). Disruptions to sleep,
including obstructive sleep apnea and shift
work, negatively
affect human health (2, 3). By comparison, 
diverse sleeping habits among wild animals 
reflect adaptations to resolve conflicts between 
sleeping or feeding while avoiding predation 
and exhaustion (4-6). In response to these 
trade-offs, cows sleep-chew, horses sleep- 
stand, ostriches sleep-stare, and frigate birds 
sleep-fly (7-10). 

Marine mammals face unique challenges in 
obtaining adequate daily sleep (11). Most of
them feed underwater and breathe at the ocean
surface, where predators typically attack (12).
Activity budgets of mammals at sea reflect the 
balance between these survival needs, which 
often push the animals toward physiological 
extremes such as large body size, prolonged 
activity, and enhanced oxygen stores (13). For 
example, northern elephant seals (Mirounga
angustirostris) travel >10,000 km during
7-month-long foraging trips. Seals minimize 
time at the surface (~2 min between 10- to 
30-min dives) to reduce predation risk by killer
whales and white sharks while maximizing
foraging time (12, 14-16). They also feed around
the clock on small prey to satisfy the energy 
requirements associated with their large body 
size (17). Given these ecophysiological demands, 
a long-standing question has been when, where, 
and how do seals sleep at sea? 


A new tool to detect sleep at sea 


We developed a new submersible system to 
record brain activity (electroencephalogram, 
EEG) and heart rate (electrocardiogram, ECG) 
concurrently with dive depth and motion of 
elephant seals at sea [Fig. 1E; (18)]. These
sensors identified sleep states [rapid eye
movement (REM) sleep and slow-wave sleep (SWS);
figs. S1 to S3 and table S2], swimming effort
(stroke rate), and three-dimensional (3D)
diving behavior in freely moving female
juvenile seals (n = 13 seals) (18). We recorded
sleep in a controlled laboratory environment 
(n = 5 seals) and in the wild (n = 8 seals) at 
four locations, including on the beach, in shal- 
low water, offshore along the continental shelf 
(depth <250 m), and in the open ocean (depth 
>250 m; table S1). EEG recordings allowed 
us to pair sleep states with diving behavior 
recorded in time-depth profiles for juveniles 
over multiple days at sea (104 sleeping dives). 
We used these sleep signatures to estimate 
sleep patterns across >3 million dives by 334 
free-ranging adult females over prolonged 
trips at sea (514,406 sleeping dives across 
53,581 recording days). 

On a typical sleeping dive, seals transitioned
from an awake glide into SWS. Although
asleep, they could maintain their upright
posture for several minutes (Fig. 1 and movie
S1). These results underscore the importance
of EEG in assessing sleep state (18). As seals
shifted from SWS to REM sleep, sleep
paralysis resulted in a loss of postural control.
Seals turned upside down and drifted downwards
in a “sleep spiral.” Sleep spirals tightened from
a median diameter of 7.5 ± 7.9 m [median ±
interquartile range (IQR)] at 71 ± 97 s (median ±
IQR) in SWS to 3.3 ± 3.5 m loops at 40 ± 29 s
in REM (Fig. 1). Sleep spirals consisted of two
to 13 consecutive 360-degree loops at 82 to 
377 m depth. On the continental shelf, seals 
slept motionless on the ocean floor at 64 to 
249 m depth. 


The predation risk of sleep at sea 


Among marine mammals, unihemispheric 
sleep (SWS in only one hemisphere) allows 
captive cetaceans and otariids (fur seals and 
sea lions) to swim and keep one eye open during 
sleep (11, 19). This suggests that cetaceans and 
otariids can sleep while monitoring predators 
(20, 21). Unihemispheric sleep has not been 
detected in captive true seals (family Phocidae) 
such as elephant seals (22). Similarly, our study 
did not reveal sleep asymmetry between hemi- 
spheres (<2-fold difference). This suggests that 
true seals use an alternative solution to mit- 
igate predation risk. This study experimen- 
tally confirms the hypothesis (22, 23) that in 
the absence of unihemispheric sleep, elephant 
seals’ extreme diving abilities allow sleep deep 
below the ocean surface, out of view of visual 
predators. 

The sleep paralysis that co-occurs with REM 
sleep would make seals especially vulnerable 
to predation (7). REM is often minimized for 
aquatic mammals because the accompanying 
paralysis can also prevent access to air (22). In 
captive fur seals confined to water, REM is 
virtually eliminated (24). Elephant seals at sea 
reduce REM sleep, as is seen in captive fur seals 
and true seals in water (24-27), but
unexpectedly exhibit a large proportion of REM
[26.5 ± 5.0% (mean ± SD) in total sleep time
overall and 29.1 ± 4.3% of at-sea total sleep time;
table S3]. This compares to 11%, 6%, 5%, and
1% in aquatic sleep for captive Caspian seals,
harp seals, walruses, and fur seals, respectively
[(22, 24-27); see the materials and methods].

Our at-sea deployments occurred during late 
spring, when juvenile seal aggregations at- 
tract predators (14). While transiting over the 
continental shelf, juvenile seals alter their 
swimming behavior to avoid predation (28). 
Unexpectedly, we found that seals slept pro- 
portionally more on the continental shelf than 
in the open ocean (Figs. 2 and 3A). One seal 
performed up to 36 consecutive sleeping dives 
on the continental shelf but fewer than five at 
sea. This suggests that seals can safely sleep at 
depth despite elevated coastal predation risk. 


Finding time to sleep at sea 


Without unihemispheric sleep allowing
continuous vigilance, seals are vulnerable and
unable to actively transit during sleep to
maximize foraging efficiency. Between heightened
predation risk and lost foraging opportunities,
we expected sleep to be strongly restricted at
sea. Supporting this hypothesis, we discovered
that seals’ daily sleep time was >5 times higher
on land than at sea (Fig. 2). Seals slept up to
14 hours/day on land [10.8 ± 3.0 hours/day
(mean ± SD)] but as little as 0 hours/day at
sea (1.7 ± 0.7 hours/day; tables S3 and S4).

stroke rate [strokes per minute (spm)], heart rate [beats per minute (bpm)], left (L) EEG (µV), right (R) EEG (µV), L EEG spectrogram [power (dB) for frequency (Hz) over time], pitch (radians), roll (absolute value; radians) and heading (radians), and time (minutes of dive). (B to D) Raw EEG and ECG signals during the transition to light SWS (B), deep SWS (C), and REM (D). During SWS, high-voltage, low-frequency slow waves are present.

logger placement. (F) Schematic demonstrating placement of electrodes for electrooculogram (EOG), EEG, ECG, and electromyogram (EMG). (G) 3D dive profile color-coded by sleep state: Active waking is shown in dark blue, quiet waking in light blue, light SWS in light green, deep SWS in teal, and REM in yellow. (H) Top view of sleep spiral. (I) Depth over time showing nested durations of gliding, electrophysiological sleep, constant

Fig. 2. Sleep patterns from land to sea. (A) Daily sleep quotas for seals in the laboratory (on land and in the pool) and in the wild (on land, in shallow water, on the continental shelf, and in the open ocean), including active waking (dark blue), calm (lighter blue), drowsiness (purple), REM sleep (yellow), and SWS (light blue). REM sleep totals include certain and putative REM (see “REM scoring” section in the supplementary materials). (B) Schematic showing the resting postures of seals in each habitat, including seals resting on the ocean floor on the continental shelf and drifting in the open ocean. (C) 2D map with bathymetry showing georeferenced dead-reckoned tracks for three animals recorded at sea. (D) 3D map demonstrating a sleeping dive sequence, including the sleeping dive from Fig. 1.

Fig. 3. Sleep identification model performance. Time-depth records for two juvenile seals (A and B) are colored to indicate surface intervals (light blue), dives (dark blue), glides (blue), SWS (green), and REM sleep (yellow). In panels (A1) and (B1), the identified sleep segments are denoted below the dive profile, where outlined dots at the beginning and end of sleep segments are colored from yellow to dark blue according to overlap with sleep (“percent nap overlap”). Light shaded regions above the dive profile in panels (A1) and (B1) demonstrate sleep identification accuracy (false positives in blue, false negatives in yellow, and true positives in green). Panels (A2) and (B2) display EEG spectrograms and heart rate for two adjacent sleeping dives. Panels (A3) and (B3) quantify daily activity budgets (or provide estimates) in hours per day of diving, sleep estimates (upper bound: unfiltered sleep ID; best estimate/lower bound includes only sleep ID segments that meet filter criteria), gliding (long glides >200 s), sleeping (both SWS and REM), and REM sleep.

Fig. 4. Estimating daily sleep for 334 adult females. (A) Map showing estimated sleep (hours per day) for two of 15 seals instrumented with stroke rate loggers (basemap matches bathymetry legend in Fig. 2). Each circle represents 1 day, with circle size and color reflecting daily sleep (hours per day). (B1 and B2) Time-depth records for two 24-hour days far from shore (B1) and close to Vancouver Island (B2), demonstrating the difference in sleep during pelagic versus coastal foraging. Yellow dots and green shaded regions above dive profiles indicate identified sleep segments that 100% overlap with a long glide segment (yellow and blue shading represent false negatives and false positives, respectively). (C) Map with spatially averaged daily sleep estimates clipped to the extent of tracking data (1 point per seal day) across 342 good-quality tracks from 267 adult females. Note the higher sleep time along the coast and foraging grounds. (D) Daily activity budgets for long foraging trips (n = 164 seals) including surface intervals, diving, long glides, long drifts, sleep estimates (filtered long drifts), and long surface intervals. Sleep estimates demonstrate low sleep time throughout the trip. (E) Comparative figure showing total sleep time in terrestrial mammalian carnivores, omnivores, and herbivores [reprinted with permission (1)]. Extremes of sleep time on land and at sea from EEG recordings in juveniles (Fig. 2) are plotted for comparison. Sleep durations for other mammals are based on behavior and/or EEG in the laboratory and/or wild. Differences in recording location and sleep identification technique (EEG versus behavior) complicate sleep quantification and direct comparison.


After returning from 2 to 3 days at sea, seals 
remained on land for 18 to 43 hours, sleeping 
up to 53.3% of each hour before returning to 
shallow water (fig. S4). This moderate sleep 
rebound was comparable to the daily patterns
of other seals at the colony (fig. S4), suggesting
that this relatively short multiday trip did not 
incur a notable sleep debt. 

Substantial fluctuations in sleep duration 
allow birds to prioritize migration and breed- 
ing for several days (5, 10). As mammals, ele- 
phant seals similarly partition strategies over 
long time scales, sacrificing sleep at sea to sup- 
port the energy requirements associated with 
their large size and deferring sleep until they 
leave the water and are on land with no preda- 
tors. Seals in laboratory settings show modest 
differences between total sleep time on land and 
water (24, 25). Here, we demonstrate greater ex- 
tremes in total sleep time for wild seals that 
are necessary to balance sleep with the need to 
replenish the energy stores of a large, highly 
mobile predator at sea. 


Mapping range-wide sleep patterns at the 
population level 


Using the paired electrophysiological and 
time-depth signatures of SWS and REM sleep 
from instrumented seals (Figs. 1 and 2), we 
developed a high-accuracy sleep identifica- 
tion algorithm that identified segments of 
inactivity characterized by low vertical speed 
and acceleration from time-depth data (93% 
accuracy; Fig. 3 and figs. S5 to S7). This algo- 
rithm allowed us to estimate sleep quotas for 
334 adult seals from their diving data recorded 
over several months at sea [n = 170 short trips
(74.6 ± 9.5 days) and n = 164 long trips (217.7 ±
24.7 days)] (Fig. 4). These analyses indicated
that daily sleep quotas were likely to be universally
low (1.1 ± 1.1 hours/day for short 70-day
trips and 2.2 ± 1.6 hours/day for >200-day trips)
(Fig. 4). 
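
The idea of flagging inactivity in time-depth data can be illustrated with a minimal sketch. It is not the authors' algorithm: it uses vertical speed only (the text also mentions acceleration), and the function name `inactive_segments`, the speed threshold, and the minimum duration are hypothetical values chosen for illustration. Timestamps are assumed to be strictly increasing.

```python
def inactive_segments(times_s, depths_m, max_speed=0.2, min_duration_s=60.0):
    """Return (start, end) times of runs where |vertical speed| < max_speed.

    times_s and depths_m are parallel lists; thresholds are illustrative only.
    """
    segments = []
    start = None  # start time of the current low-speed run, if any
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        speed = abs(depths_m[i] - depths_m[i - 1]) / dt  # m/s between samples
        if speed < max_speed:
            if start is None:
                start = times_s[i - 1]
        else:
            # The run ended at the previous sample; keep it if long enough.
            if start is not None and times_s[i - 1] - start >= min_duration_s:
                segments.append((start, times_s[i - 1]))
            start = None
    # Close a run that lasts until the end of the record.
    if start is not None and times_s[-1] - start >= min_duration_s:
        segments.append((start, times_s[-1]))
    return segments
```

Summing the durations of such segments per day would give a crude sleep-quota estimate; the published method additionally relies on signatures validated against EEG recordings.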

Expanding this analysis to the population 
level, we can map range-wide sleep patterns 
to identify critical habitats for protecting wild 
seals while they sleep at sea. These “sleep- 
scapes,” which are based on 342 foraging trips 
by seals across the North Pacific (Fig. 4C), re- 
veal the same unexpected sleep patterns as in 
juvenile EEG records. That is, seals slept more 
while closer to the coastline despite greater 
predation risk (Fig. 4B1) (74). Because coastal 
foragers consume fewer, larger prey (17), our 
findings suggest that these seals must either 
expend more energy hunting for larger prey or 
require more time to rest and process such 
prey. Although the coastal water column may 
harbor more predators, the continental shelf 
may also facilitate sleep by providing shelter 
from predators and relative proximity to the 
surface. These findings and the resulting sleep- 
scape aid in identifying critical habitats that 
may guide coastal conservation efforts for wild 
animals. 

By connecting locomotion with different 
forms of sleep (SWS versus REM) in northern 
elephant seals, the present study provides 
conclusive evidence of sleep during drift dives 
(23, 29, 30). Furthermore, these unique recordings
of brain activity for a wild, free-ranging
marine mammal at sea show that sleeping at 
depth allows seals to drift safely in and out of 
sleep paralysis. However, these respites are 
short, because the large body size [456 to 687 kg 
(min-max adult female arrival breeding mass); 
(31)] of this elite diver that forages and sleeps in
the dark must be sustained by near-constant 
foraging at sea. Sleep patterns interpreted 
from the dive records of hundreds of seals re- 
vealed only 2 hours/day of sleep for months, 
rivaling the record for the least sleep among 
mammals [the African elephant at 2 hours 
per day; (32)]. Both this method (applying 
sleep signatures from a small sample to reveal 
population-level patterns) and these find- 
ings (a detailed understanding of sleep for a 
highly mobile, large mammal) provide oppor- 
tunities for understanding sleep’s function, 
evolution, and pathology across mammals, 
including in humans. 
PROTEIN DESIGN


Top-down design of protein architectures with reinforcement learning


Isaac D. Lutz, Shunzhi Wang, Christoffer Norn, Alexis Courbet, Andrew J. Borst,
Yan Ting Zhao, Annie Dosey, Longxing Cao, Jinwei Xu, Elizabeth M. Leaf,
Catherine Treichel, Patrisia Litvicov, Zhe Li, Alexander D. Goodson, Paula Rivera-Sanchez,
Ana-Maria Bratovianu, Minkyung Baek, Neil P. King,
Hannele Ruohola-Baker, David Baker


As a result of evolutionary selection, the subunits of naturally occurring protein assemblies often fit 
together with substantial shape complementarity to generate architectures optimal for function in a 
manner not achievable by current design approaches. We describe a “top-down” reinforcement learning-based
design approach that solves this problem using Monte Carlo tree search to sample protein
conformers in the context of an overall architecture and specified functional constraints. Cryo-electron
microscopy structures of the designed disk-shaped nanopores and ultracompact icosahedra are very
close to the computational models. The icosahedra enable very-high-density display of immunogens and
signaling molecules, which potentiates vaccine response and angiogenesis induction. Our approach 
enables the top-down design of complex protein nanomaterials with desired system properties and 
demonstrates the power of reinforcement learning in protein design. 


Multisubunit protein assemblies play
critical roles in biology and are the
result of evolutionary selection for function
of the entire assembly. Therefore,
the subunits in structures such as icosahedral
viral capsids often fit together almost
perfectly (1, 2). In contrast to direct evolutionary
selection on overall system properties,
de novo protein design has generated protein 
architectures using a “bottom-up” hierarchical 
approach (Fig. 1A, left) in which monomeric 
structures are first docked into symmetric 
oligomers (3-6) and then assembled into 
closed assemblies with tetrahedral, octahe- 
dral, or icosahedral symmetry (7-14) or open 
assemblies such as two-dimensional (2D) lay- 
ers and 3D crystals (15-19). An advantage of 
this hierarchical approach is that the multi- 
ple interfaces that stabilize the assembly can 
be validated independently (the first by char- 
acterization of the symmetric oligomer and 
the second by characterization of the nano- 
material assembly from the preformed oligo- 
mer), considerably increasing the robustness 
of the overall design process. Although such 
designed assemblies are already proving useful
for biomedicine in immunobiology and
other areas, as highlighted by the recent ap- 
proval of a de novo-designed COVID vaccine 
(20-23), the bottom-up approach does have 
limitations. The properties of the assembly 
are limited to what can be generated from the 
available oligomeric building blocks, at least 
one of the subunit-subunit interfaces must be 
strong enough to stabilize a cyclic oligomeric 
substructure in isolation, and, more generally, 
there is no way to directly optimize the prop- 
erties of the overall assembly. 

We sought to overcome the limitations of 
bottom-up protein complex design by devel- 
oping a top-down approach (Fig. 1A, right) 
that starts from a specification of the desired 
properties (overall symmetry, porosity, etc.) of 
the structure and systematically builds up 
subunits that pack together to optimize these 
properties. We reasoned that protein frag- 
ment assembly (24-28), which can generate 
a wide variety of monomeric protein struc- 
tures, could provide a suitable mechanism 
for generating diversity. Previous design ap- 
proaches such as SEWING have built up pro- 
teins from fragments, optimizing for monomer 
stability at each step (29), but we aimed in- 
stead to optimize for overall system properties, 
which could involve trading off monomer 
stability for increased subunit-subunit inter- 
action strength and other properties. To en- 
able such end state-based optimization, we 
turned to reinforcement learning (RL), which 
has achieved considerable success recently in 
different fields of artificial intelligence, such 
as self-driving cars (30), the AlphaGo pro- 
gram that defeats top human players in the 
game of Go (31, 32), and algorithm develop- 
ment (33). Monte Carlo tree search (MCTS) 
(34, 35) is an RL algorithm that finds optimal 
series of choices within a search tree. In MCTS,
choices are selected randomly at each branch
point to find a path down the tree, and after
exploring a path, the state is evaluated, and the
probabilities at each branch point are back-propagated
up the tree and reweighted accordingly such
that subsequent iterations are more likely to
lead to optimal paths.


Backbone sampling by MCTS 


We sought to develop an MCTS algorithm for
generating protein complexes that builds up 
the monomeric subunits from protein frag- 
ments directly optimizing for prespecified 
global structural properties. We set up the 
tree search such that at each step in the tree, 
a short protein fragment is appended at either 
the N terminus or C terminus of the growing 
chain. The number of fragments to consider at 
each step is a trade-off between the rapidity of 
learning (with a smaller number, weights on 
each choice can be learned more quickly) and 
the total diversity of structures that can be 
generated (which increases with the number 
of choices at each step). We chose to balance 
these factors by using as building blocks para- 
metrically generated straight helices, which 
are fully described by a single parameter (the 
length, which we allow to vary from nine to 
22 residues), followed by short loops clus- 
tered into 316 bins (derived from clustering 
loops in a large helical protein database; see 
the materials and methods). The search be- 
gins with the selection of one of the helix 
possibilities and then alternates between the 
addition of a loop or a helix choice at either 
terminus. Once a loop bin is chosen, we select 
randomly from the closely related loop back- 
bones within the cluster (Fig. 1B, left). Although 
this is a far narrower set of local structures 
than observed in native protein structures, 
we found in preliminary explorations that a 
wide variety of compact protein shapes could 
be readily generated from such building blocks. 
Building up a 100-residue protein backbone 
with this approach requires about five helix 
and four loop additions, yielding a total 
number of possibilities of ~1 × 10^17, with
additional structural diversity from the var- 
iation in loop backbones within a bin. The 
size of the search tree grows exponentially 
with the number of structural elements, so 
the space of possibilities is more effectively 
explored for monomers with fewer helices 
than for larger monomers. 
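
As a rough worked check on the size of this search space, the counts given above (14 helix lengths, 316 loop bins, five helices, four loops) can be multiplied out. The sketch below rests on an assumption that is not stated in the text: each loop may be placed at either terminus, and the helix that follows is attached to that loop. The name `backbone_count` is hypothetical.

```python
N_HELIX_LENGTHS = 22 - 9 + 1   # helix lengths from nine to 22 residues
N_LOOP_BINS = 316              # loop clusters, as given in the text
N_HELICES, N_LOOPS = 5, 4      # elements in a ~100-residue backbone

def backbone_count(terminus_choices_per_loop=2):
    """Count distinct helix/loop choice sequences under the stated assumption."""
    helix_choices = N_HELIX_LENGTHS ** N_HELICES
    loop_choices = N_LOOP_BINS ** N_LOOPS
    terminus_choices = terminus_choices_per_loop ** N_LOOPS
    return helix_choices * loop_choices * terminus_choices

print(f"{backbone_count():.2e}")
```

Under this assumption the product is a little under 10^17; without the terminus choice it is roughly 5 × 10^15. Either way, adding one more helix-loop pair multiplies the count by several thousand, which is the exponential growth the paragraph describes.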

Fig. 1. Top-down design strategy and computational pipeline. (A) Bottom-up
(left) and top-down (right) strategies to protein assembly design. (B) (Left)
MCTS architecture for monomer backbone generation. During each simulation,
a helix stub is initialized at a random rigid body start position, and different
configurations of helices and loops are sampled and constructed sequentially
with probability P_i stored in each edge to build the search tree. Each move
is checked against a set of predefined geometric constraints during the
expansion stage and then updates probabilities P_i' afterward. Upon successful
completion of a search tree, the monomer is evaluated by score functions
and probabilities P_i'' are back-propagated to update all of the search tree edges.
(Right) Symmetric transformations are applied to build an icosahedral capsid in
parallel with monomers using the MCTS generative algorithm. (C) Concurrent
geometric check (left) is performed at every step of the expansion stage and the
search tree is terminated if there are violations. Final evaluation (right) with a
series of score functions is performed upon completion of a simulation for
monomers and assemblies. (D) In silico RL-generated capsids (blue) occupy a
distinct structural space compared with de novo-designed protein cages (red)
and natural capsids (green).

The search is modulated based on the specific
problem specification through geometric
constraints that are applied at each step in
the search tree and score functions that are
evaluated only after full structures are completed.
Potential moves consisting of helix
or loop fragments are selected at each level
of the search tree only if they pass geometric
constraints that can be evaluated before the
assembly of the entire structure; these include
internal clashes and overall shape constraints
(see the materials and methods for a full list
of geometric constraints). Upon selection of a
move passing the geometric constraints, its
probability is upweighted, as are the probabilities
of all prior moves leading to this point
in the search tree. Completed backbones are
evaluated using score functions that assess
how well the overall generated structure satisfies
the user specification of the problem to
be solved (Fig. 1C and materials and methods),
and the probabilities of selection of each
move at each step along the search tree are
reweighted accordingly. As individual move
weights become increasingly biased after many
traversals through the search tree, the generated
complete backbones have higher and
higher scores (fig. S1). Because each iteration
takes on average only tens of milliseconds,
high-scoring backbones can be sampled
at scale by searching over tens of thousands
of iterations. To address the classical RL problem
of balancing exploration with exploitation
(30-32), the search is initialized from
many independent trees, and the maximum
probability of any one move is capped (see
the materials and methods).
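
The sample, check, score, and reweight loop described above can be sketched on a toy problem. This is a simplified illustration, not the authors' implementation: the "geometric constraint" is replaced by a residue budget, the score by how fully that budget is used, and the constants `TARGET`, `N_MOVES`, and `MAX_WEIGHT`, as well as the additive reweighting rule, are hypothetical. Capping the edge weight is used here as a simple stand-in for capping the probability of any one move.

```python
import random
from collections import defaultdict

HELIX_LENGTHS = list(range(9, 23))  # helix lengths of nine to 22 residues
TARGET = 60        # hypothetical residue budget standing in for a build volume
N_MOVES = 4        # hypothetical fixed depth of the toy search tree
MAX_WEIGHT = 20.0  # hypothetical cap on any single edge weight

def passes_constraint(path, move):
    """Stand-in for the per-step geometric check: stay within the budget."""
    return sum(path) + move <= TARGET

def score(path):
    """Stand-in end-state score in [0, 1]: fraction of the budget used."""
    return sum(path) / TARGET

def run_tree(iterations, rng):
    """One independent tree: sample a path, score it, reweight its edges."""
    weights = defaultdict(lambda: 1.0)  # edge (prefix, move) -> weight
    best_path, best_score = (), 0.0
    for _ in range(iterations):
        path = ()
        while len(path) < N_MOVES:
            moves = [m for m in HELIX_LENGTHS if passes_constraint(path, m)]
            if not moves:  # no move passes the check: end this traversal
                break
            w = [weights[(path, m)] for m in moves]
            path += (rng.choices(moves, weights=w)[0],)
        # Only completed paths are scored, mirroring end-state evaluation.
        s = score(path) if len(path) == N_MOVES else 0.0
        # Back-propagate: upweight every edge on the path, up to the cap.
        for i in range(len(path)):
            edge = (path[:i], path[i])
            weights[edge] = min(MAX_WEIGHT, weights[edge] + s)
        if s > best_score:
            best_path, best_score = path, s
    return best_path, best_score

def search(n_trees=5, iterations=2000, seed=0):
    """Run several independent trees and return the best (path, score)."""
    rng = random.Random(seed)
    results = [run_tree(iterations, rng) for _ in range(n_trees)]
    return max(results, key=lambda r: r[1])
```

Running several trees and capping the weights both keep the search from committing too early to one branch, which is the exploration-exploitation balance the paragraph refers to.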

We first tested the MCTS approach in silico 
at the protein monomer level, choosing as a 
test problem the generation of protein back- 
bones with arbitrarily prespecified overall 
shapes. To our knowledge, there are no cur- 
rent approaches for addressing this problem. 
A specified build volume is represented on a 
grid, and the MCTS is initialized randomly 
within the volume. At each move, only addi- 
tions that stay within the specified volume are 
accepted. For a range of prescribed shapes, 
including regular polyhedra and letters from 
the alphabet, the ensembles of generated struc- 
tures closely fill the specified volumes, and indi- 
vidual backbones have the prespecified shapes 
(fig. S2). The average sequence length of the 
solutions increases through the optimization 
as the choices of moves and combinations of 
moves that lead to satisfaction of the input 
constraints are learned, enabling traversal fur- 
ther down the search tree (fig. S1A). 

We next sought to generalize the MCTS to 
the design of symmetric nanomaterials by ap- 
plying symmetry operators to generate assem- 
blies with the desired symmetry at each step 
in the search tree. Each move (helix or loop 
addition) is assessed by considering not only 
the growing monomer, but also its interac- 
tions with all nearby symmetry mates, com- 
puted using transformation matrices specifying 
each symmetry operator; moves that introduce 
steric clashes are discarded (Fig. 1, B and C). We 
tested these capabilities in silico by designing 
cyclic assemblies with symmetries C5 through 
C12, as well as tetrahedral, octahedral, icosahe- 
dral, and quasisymmetric icosahedral assem- 
blies of up to 240 subunits (figs. S3 and S4). 
We found that by providing different geomet- 
ric constraints and score functions to guide 
the search, we could control properties such 
as shape, size, porosity, and termini position 
from the top down (figs. S3 to S6). 
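
A minimal sketch of the symmetry-mate check for the cyclic case is given below. It assumes a point-based representation of the monomer and a single distance cutoff; the function names and the 3.0 Å default are hypothetical, and the real clash criteria are described in the materials and methods. For C_n symmetry about the z axis there are n - 1 non-identity operators, just as an icosahedral assembly of 60 subunits needs 59.

```python
import math

def cyclic_operators(n):
    """Rotation matrices about z for the n - 1 non-identity mates of C_n."""
    ops = []
    for k in range(1, n):
        a = 2.0 * math.pi * k / n
        ops.append(((math.cos(a), -math.sin(a), 0.0),
                    (math.sin(a), math.cos(a), 0.0),
                    (0.0, 0.0, 1.0)))
    return ops

def apply(op, point):
    """Apply a 3x3 rotation matrix to an (x, y, z) point."""
    return tuple(sum(op[i][j] * point[j] for j in range(3)) for i in range(3))

def clashes(monomer, ops, min_dist=3.0):
    """True if any monomer point lies within min_dist of any symmetry mate."""
    for op in ops:
        for mate_point in (apply(op, p) for p in monomer):
            if any(math.dist(p, mate_point) < min_dist for p in monomer):
                return True
    return False
```

In a tree search, a candidate helix or loop would be appended to `monomer` and discarded if `clashes` returns true, so the check runs on partial structures at every step.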


Nanopore construction using constrained 
symmetric MCTS 


As a first experimental test of the MCTS ap- 
proach, we applied it to the highly constrained 
design challenge of filling the space between 
two previously designed cyclic protein rings 
(6, 36) to generate disk-shaped structures with 
a central nanopore (Fig. 2A). Filling this sub- 
stantial but irregularly shaped space such 
that there are no large voids between the two 
rings is not straightforward with previously 
described protein design methods. We ap- 
proached this challenge with MCTS by geo- 
metrically constraining the search to the space 
between the two rings, requiring dense packing
such that the only large void in the resulting
assembly is the pore of the inner C6
ring. Both the inner and the outer ring have 
C6 symmetry, and the search tree was initial- 
ized to start at the N termini of the outer ring 
and simultaneously build six subunits that 
collectively fill the empty space. We performed 
the MCTS for each of 2000 placements of a 
set of different inner rings with a range of 
inner pore sizes inside a constant outer ring 
(for each inner ring, we sampled rotations 
around and translations along the common 
cyclic symmetry axis). We selected backbones 
that fully filled the space between the two 
rings, designed sequences with ProteinMPNN 
(37), and selected for experimental character- 
ization 32 designs predicted to assemble into 
the designed assemblies by AlphaFold (AF) 
(38). Of these, we found that 28 were soluble 
and could be purified and 11 formed particles 
with the expected size and shape by negative- 
stain electron microscopy (nsEM). nsEM 3D 
reconstructions for two designs had an over- 
all shape closely consistent with that of the 
design models (Fig. 2B; some C7 2D class aver- 
ages were also obtained; fig. S7). We obtained a 
cryo-electron microscopy (cryo-EM) map of a 
third design at 5.1-Å resolution and found it to
be closely consistent with the design model: 
The alpha helices of the model are clearly 
within the contours of the density (Fig. 2C and 
fig. S8). The MCTS solution effectively sat- 
isfies the design criteria: The space between 
the two original rings is completely filled in, 
generating a disk-like structure with a narrow 
circular pore in the center. We are not aware 
of any previously designed or naturally oc- 
curring proteins that have this overall shape, 
which could be very useful for downstream 
nanopore-based sensing applications. More 
generally, these results demonstrate that the 
MCTS approach can solve highly constrained 
protein design problems. 


Top-down design of mini-icosahedra 


We next explored the use of MCTS to gener- 
ate icosahedral assemblies by using 59 trans- 
formation matrices to compute symmetry 
mates for a growing monomer. We sought 
to design very small, closely packed capsids 
inaccessible by other design methods, and 
developed geometric constraints and score 
functions to specifically favor such struc- 
tures (Fig. 1 and materials and methods). The 
end state-based score functions include mea- 
sures of cage porosity and interface desig- 
nability, as well as external placement of at 
least one terminus to enable fusion constructs 
(Fig. 1C). Given a specification of the length 
and number of helices in the monomer and 
the size of the overall assembly, we initial- 
ized millions of MCTS trajectories starting 
from a short helical fragment randomly placed
within a specified upper distance bound of the
origin in a random orientation and performed 
10,000 iterations for each to generate a large 
set of diverse structures. The MCTS generated 
closely packed icosahedral assemblies in silico, 
which span a structural space distinct from 
that of native and previous de novo icosahe- 
dra, with shorter sequence lengths than any 
previously described protein icosahedra and 
porosities comparable to the densely packed 
capsids generated by evolution (Fig. 1D). 

The MCTS method rapidly generates tens 
of thousands of candidate icosahedral assem- 
blies, and we experimented with approaches 
for rapidly designing sequences that stabilize 
these assemblies in a manner compatible with 
our overall top-down approach. In previous 
bottom-up nanocage design studies, the se- 
quences and backbones of the oligomeric 
building blocks are pre-optimized, so only the 
new interface formed between the building 
blocks in the cage is designed, and the over- 
all backbone is kept largely fixed (71). By 
contrast, with the top-down MCTS approach, 
the entire sequence must be designed, with 
backbone relaxation to optimize sequence- 
structure compatibility both within and be- 
tween the monomers and to increase interface 
shape complementarity. A deep neural net- 
work trained to learn the sequence and struc- 
ture relationships of native proteins was used 
to generate amino acid sequence profiles 
for each position in the newly generated back- 
bones, which were used in turn to bias amino 
acid selection in the sequence design stage 
using Rosetta design (materials and meth- 
ods and figs. S9 to S15). The resulting de- 
signs were filtered on the basis of interface 
contact molecular surface area (38), shape 
complementarity, predicted binding energy, 
exposed surface hydrophobicity, and AF (39) 
prediction similarity to the design model (see 
the materials and methods). The rigid body and 
internal degrees of freedom of the selected 
icosahedral assemblies were then optimized 
by Rosetta symmetric relaxation (40, 41), start- 
ing from both the Rosetta design model of the 
assembly and the AF-predicted structure of the 
monomer mapped back onto the assembly. To 
further increase sequence-structure compatibil- 
ity, we repeated this design-predict-relax cycle 
three times, at each iteration performing se- 
quence design on the full assemblies generated 
in the previous iteration, mapping back the 
predicted monomer structures into the assem- 
blies, and relaxing the full structure in Rosetta. 
We applied this sequence design and backbone 
refinement procedure to 220,000 of the MCTS- 
generated backbones and selected 368 designs 
for experimental characterization (detailed fil- 
tering processes are described in the materials 
and methods and figs. S11 to S13). 

Fig. 2. Disk-nanopore design with symmetric MCTS. (A) Schematic
illustration of MCTS-based sampling to build space-filling connectors between
two concentric rings to generate disk-like structures with different nanopore
inner diameters. The inner ring was placed in the center of a host outer ring,
varying the rotation and vertical offset, which generates different void volumes
(teal; middle panel above arrows). MCTS was then performed to densely fill these
void volumes (blue). (B) Design models (left column) and nsEM 3D ab initio
reconstruction maps (right column) of two connected disk-nanopores (RNR_C6_1
and RNR_C6_2). The symmetric MCTS sampling built helices to connect the
inner ring C terminus and outer ring N terminus (highlighted in red and blue,
respectively, in the left column). (C) The cryo-EM map at 5.1-Å resolution
for design RNR_C6_3 viewed from the top, bottom, and side is very close to
the design model, with a narrow circular pore in the center of an otherwise
nonporous disk-like structure.

Linear gene fragments encoding each design
with hexahistidine purification tags were cloned
into an Escherichia coli expression vector, and
the proteins produced in E. coli in a 96-well
format were purified by immobilized metal
affinity chromatography (IMAC) pull-down. A
total of 208 of the 368 designs were expressed
and soluble as assessed by SDS-polyacrylamide
gel electrophoresis. To evaluate particle formation,
we performed nsEM on the IMAC
elution fraction for each soluble sample. Two
designs (RC_I_1 and RC_I_2, RL capsid with I
symmetry, design 1 and 2) formed uniform
particles with the expected size and shape
(Fig. 3, A and B). Size-exclusion chromatography
(SEC) of both designs yielded single
peaks with an apparent molecular weight in
the range expected for these assemblies (Fig. 3,
C and D). The designed assemblies had the
expected alpha-helical circular dichroism (CD)
spectra and apparent melting temperatures
above 65°C. nsEM analysis showed that assembly
morphologies were retained after 1 hour of
treatment at 95°C and subsequent cooling to
25°C (Fig. 3, F and H, and fig. S17).

Fig. 3. Experimental characterization of designed capsids RC_I_1 and RC_I_2. (A and B) Representative
nsEM micrographs and reference-free 2D class averages (inset) for RC_I_1 (left) and RC_I_2 (right). Scale bar,
200 nm. (C and D) A single peak was observed for each SEC elution profile near the expected elution
volumes for the target complexes. (E and G) Capsid computational design models. (F and H) Circular
dichroism spectra measured at different temperatures (°C).

To evaluate the accuracy of our design strategy,
we determined the structures of SEC-purified
RC_I_1 and RC_I_2 capsid particles
using cryo-EM (Fig. 4 and fig. S18). For RC_I_1,
3D reconstruction yielded a 2.5-Å-resolution
cryo-EM atomic model that closely matched
the computational design (Fig. 4, A and B,
and fig. S19). The N-terminal helices of two
monomers pack in an antiparallel fashion to
form the primarily hydrophobic C2 interface,
whereas the two helices near the C terminus
form the C5 interface with their neighbors
(Fig. 4, B and C). Small apertures (diameter
~13 Å) present at the C3 axes of the capsid
make the N termini available for genetic fusion
(Fig. 4C). Over the designed monomer,
the root mean square deviation (RMSD) between
the cryo-EM structure and the design
model is 0.76 Å (Fig. 4D); a single rotamer
flip (Phe®*) and tilting of the C-terminal helix
result in a slight expansion of the overall
cage diameter, giving an RMSD over all
60 subunits of 3.72 Å (Fig. 4E). For RC_I_2, the
2.9-Å cryo-EM structure was
even closer to the design model (Fig. 4, F and
G, and fig. S20), with RMSDs at the C2 and C5
interfaces of 0.66 and 0.27 Å, respectively
(Fig. 4H). The RC_I_2 monomer adopts the
designed three-helical bundle fold with a 0.59-Å
RMSD to the design model (Fig. 4I), and the
overall assembly is almost identical to the
design model with a 1.39-Å RMSD over all
60 subunits (Fig. 4J). The C2 interface is
situated near the extended C terminus of the
monomer, allowing for potential monomeric
or dimeric genetic fusions. The C5 pentameric
interface is mediated by interactions between
the N-terminal helices, which point inward
and enable functionalization of the interior of
the capsid. With diameters of 13 and 10 nm for
RC_I_1 and RC_I_2, respectively, and associated
monomer lengths of 67 and 54 residues,
the designed mini-capsids are considerably
smaller than most viral capsids.
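
For readers unfamiliar with the metric, RMSD between a model and an experimental structure can be computed as in the minimal sketch below. It assumes the two coordinate sets are already superimposed and list the same atoms in the same order; real structure comparisons first apply an optimal superposition, which is omitted here. The function name `rmsd` is illustrative.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points, in input units."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Mean of squared distances between paired atoms, then the square root.
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))
```

Because the full-cage value is computed over all 60 subunits after a single global alignment, a small tilt within each monomer that pushes subunits outward raises the assembly RMSD far more than the monomer RMSD, consistent with the RC_I_1 numbers above.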


Applications of top-down-designed capsids 


The compact size and corresponding small
exterior surface area of the designed particles
enable the display of 60 or 120 copies
of N- and/or C-terminal fused proteins with 
exceptionally high density: six or more times 
higher than previously designed icosahedral 
cages. We set out to explore whether this higher 
density could lead to greater biological effi- 
cacy in signaling and vaccine applications. 
We began by exploring the robustness of the 
designs to substantial sequence changes and 
to fusion of proteins to their outward-facing 
termini. 
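
The link between particle size and display density can be made concrete with a back-of-the-envelope calculation. The sketch below idealizes each particle as a sphere, which is an assumption; the 13-nm diameter is the RC_I_1 value quoted earlier, while the comparison diameter passed to `density_ratio` is whatever hypothetical value the reader supplies, not a measurement from the text.

```python
import math

def display_density(copies, diameter_nm):
    """Displayed copies per nm^2 on a sphere (surface area = pi * d^2)."""
    return copies / (math.pi * diameter_nm ** 2)

def density_ratio(d_small_nm, d_large_nm, copies_small=60, copies_large=60):
    """How many times denser the display is on the smaller particle."""
    return (display_density(copies_small, d_small_nm)
            / display_density(copies_large, d_large_nm))

# 60 copies on a 13-nm sphere: roughly 0.11 copies per nm^2.
print(round(display_density(60, 13.0), 3))
```

At equal copy number the ratio scales with the square of the diameter ratio, so halving the diameter quadruples the density.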

To evaluate robustness to sequence changes, 
we used ProteinMPNN (37) to generate diverse 
sequences for the RC_I_1 capsid backbone, and
the designs were filtered using the AF and
Rosetta metrics described above. Two of six 
experimentally tested ProteinMPNN designs, 
RC_I_1-H9 and RC_I_1-H11 (the former de- 
signed by ProteinMPNN using the working 
capsid backbone Cα coordinates as input, the
latter the idealized polyA backbone without
any backbone optimization and relaxation),
assembled into the designed I1 symmetric
capsid as evidenced by IMAC, SEC, and nsEM.
A 3-Å cryo-EM structure of RC_I_1-H11 was
almost identical to the design model, with a
monomeric RMSD of 0.60 Å (Fig. 5A and fig.
S21) and a very low full-cage RMSD over all
60 subunits of only 0.96 Å. RC_I_1-H9 and
RC_I_1-H11 have on average 46% sequence
divergence from the parent capsid and 30%
sequence difference from each other, including
highly diverse interface residue selections
(fig. S22; for example, the errant Phe®’
of the parent capsid was redesigned to Glu’
in RC_I_1-H11, likely accounting at least in
part for the closer agreement of RC_I_1-H11
with the design model). These results dem- 
onstrate that the RL approach can generate 
directly designable protein backbone geome- 
tries with a high degree of accuracy. 

We evaluated the robustness of the designs 
to genetic fusion by fusing SpyTag, SpyCatcher 
(42), and green fluorescent protein (GFP) pro- 
teins to the RC_I_1-H11 capsid with an N-terminal
(GGS),, linker (Fig. 5B and figs. S23
to S25). In all cases, SEC elution profiles and 
nsEM micrographs showed monodisperse 
particles of the expected size and shape (see 
the materials and methods). The 2D class av- 
erages (inset) revealed spherical structures 
similar to that of the original icosahedral 
capsid, with additional density at the periphery 
of the particles, consistent with fused pro- 
teins connected to scaffolds through a flexible 
linker. Unlike a larger cage, nuclear localiza- 
tion sequence-tagged capsids fused to GFP 
are efficiently translocated into the nucleus, 
opening the door to nuclear delivery of high- 
valency protein and DNA-organizing constructs 
(fig. S26). 

To assess the efficacy of the designed cap- 
sids in activating cellular signaling pathways 
by clustering cell surface receptors, we fused 
60 copies of the angiopoietin 1 (Ang1) F domain
(Fd), which binds the Tie2 receptor, to
RC_I_1-H11 using SpyTag-SpyCatcher conjugation
(14, 18, 43) (see the materials and methods
and fig. S27). We found that the F domain-displaying
capsids had very high potency in
driving FOXO1 exclusion from the nucleus 
(Fig. 5, C and D), activating the AKT pathway 
(Fig. 5D and fig. S28, A to C) and stabilizing 
nascent blood vessels formed from human um- 
bilical vein endothelial cells (HUVECs; Fig. 5E) 
(43-49). The Fd-displaying capsids (0.16 nM 
RC_I_1-H11-Fd) elicited stronger responses 
than a 10-fold greater concentration of a much 
larger F-domain-presenting icosahedral nanoparticle
(I53-50) (12, 43); the elevated potency
likely results from the higher surface display
density [to facilitate comparison, concentrations
at the bottom of Fig. 5D are in terms of
Fd monomer (0.16 nM capsid × 60 Fd copies
per capsid = 10 nM Fd)]. The 0.16 nM (10 nM
Fd) capsid also elicited stronger responses
than 100 nM Ang1. The F domain-displaying
capsid is thus an exceptionally potent Tie2-activating
ligand. The designed capsid is also
far easier to produce and much more stable
than Ang1 and thus could be useful in stimulating
differentiation and regeneration.

Fig. 4. Near-atomic resolution cryo-EM structures of designed capsids
match design models. (A) A 2.5-Å cryo-EM reconstruction of RC_I_1 viewed
along the three symmetry axes. Scale bar, 20 Å. (B) Cryo-EM structure of RC_I_1
highlighting monomer packing and interfaces along each symmetry axis.
(C) Overlay and RMSD calculations for RC_I_1 compared with the design model
for each symmetry interface (cryo-EM is shown in blue; design is shown in gray).
(D) Overlay and RMSD calculation for a single monomer of RC_I_1. (E) Overlay and
RMSD calculation for the entire 60-mer RC_I_1 capsid. (F) A 2.9-Å cryo-EM
reconstruction of RC_I_2 viewed along the three symmetry axes. Scale bar, 20 Å.
(G) Cryo-EM structure of RC_I_2 highlighting monomer packing and interfaces along
each symmetry axis. (H) Overlay and RMSD calculations for RC_I_2 compared
with the design model for each symmetry interface (cryo-EM is shown in pink; design
is shown in gray). (I) Overlay and RMSD calculation for a single monomer of
RC_I_2. (J) Overlay and RMSD calculation for the entire RC_I_2 capsid.

The high surface presentation density enabled
by the designed scaffolds provides a route
to investigating the effect of packing density 
on the elicitation of immune responses by 
nanoparticle-based immunogens. As a first 
step in this direction, we fused trimeric in- 
fluenza hemagglutinin (HA) to the N termi- 
nus of I1-capsid RC_I_1 using a (GS)g linker. 
The fusion protein was expressed and sec- 
reted from mammalian cells and clearly forms 
Fig. 5. Applications of designed capsids. (A) Robustness of RC_I_1 capsid to
sequence redesign using ProteinMPNN. The 3-Å-resolution cryo-EM reconstruction
of RC_I_1-H11 (top) reveals a close agreement between the experimental
structure (purple) and the design model (gray) with an RMSD of 0.96 Å. The
RC_I_1-H11 structure is nearly identical to RC_I_1 despite considerable sequence
differences [bottom; residue differences are highlighted in red (RC_I_1-H11)
and teal (RC_I_1)]. (B) From top to bottom, models and representative nsEM
images of spyTag-, spyCatcher-, and GFP-fused (to N terminus) RC_I_1-H11 with
2D class averages. Scale bar, 50 nm. (C and D) RC_I_1-H11-Fd activates Tie2
downstream Akt phosphorylation and FOXO1 translocation. Serum-starved
HUVECs were treated with serially diluted RC_I_1-H11-Fd (1000-0.1 nM), Fd-st
(100 nM), Ang1 (100 nM), I53-50 (100 nM), or phosphate-buffered saline (PBS)
control for 15 min before protein lysate collection for Western blot analysis, or 
cells were fixed for FOXO1 antibody stain. (C) Left, representative confocal 
images of HUVECs immunofluorescence stained with FOXO1 antibody. Right, 
quantification showing the percentage of cells with nuclear FOXO1; 100 cells were 
counted in each biological replicate. Levels of significance were compared with 
PBS control in the FOXO1 graph. (D) Quantification of Western blot showing 
pAKT signal normalized to RC_I_1-H11-Fd at 10 nM. RC_I_1-H11-Fd induces a
significantly higher signal than the previously characterized I53-50-Fd (inset) at
100 nM Fd equivalent. (E) Quantification of vascular stability by averaging the 
number of nodes, meshes, and tubes calculated at the 72-hour time point using the 
Angiogenesis Analyzer plug-in in ImageJ (fig. S28D). In (C) to (E), P values were 
calculated using one-way ANOVA with Bonferroni's multiple-comparisons test in 
Prism for comparing groups of two or more; *P < 0.05; **P < 0.01; ***P < 0.001; 
****P < 0.0001; significance over PBS control is noted as # in (D). (F) Representative 
nsEM micrograph and 2D class averages (inset) of mammalian cell secreted RC_I_1 
particle flexibly fused with MI15 influenza HA (MI15-RC_I_1). Scale bar, 50 nm.
(G) The RC_I_1 displayed HA is antigenically intact, reacting with both head (5J8)
and stem (CR9114) anti-HA antibodies in biolayer interferometry experiments.
(H) Models of RC_I_1 (top) and I53_dn5 (bottom) displaying MI15 influenza HA (left);
the presentation is considerably denser in the former. Top right: Mouse immunization
schedule. Bottom right: HA-specific antibody titers in immune sera. Statistical
significance was determined using one-way ANOVA with Tukey's multiple-comparisons
test; *P < 0.05; ****P < 0.0001. The RC_I_1 display format produces a
higher antibody titer than the I53_dn5 nanoparticle currently in clinical trials.


HA-displaying particles according to SEC and 
nsEM (fig. S29 and Fig. 5F). Biolayer interfer- 
ometry showed binding of both 5J8 [anti-HA 
head antibody (50)] and CR9114 [anti-HA 
stem antibody (51)] immunoglobulin G to 
HA capsids (Fig. 5G), indicating that the 
HA remains antigenically intact when displayed
on the surface of the capsids. We immunized
mice with HA-displaying RC_I_1, as
well as a much larger icosahedral immunogen, 
HA-I53_dn5 (52), which has previously been 
shown to elicit protective responses against 


influenza and is currently being evaluated in 
clinical trials (53). We found that HA-displaying 
RC_I_1 elicited a strong antibody response
against vaccine-matched HA that was greater 
than that produced by the clinical vaccine can- 
didate by a small but statistically significant 
amount (Fig. 5H). These results indicate that
the high antigen presentation density enabled 
by top-down design can yield robust immune 
responses. 


Conclusion 


Our top-down RL approach enables the solution 
of design challenges inaccessible to previous 
bottom-up design methods. Cryo-EM struc- 
tures confirm the design of 54- and 67-residue 
proteins that assemble into 60-subunit icosa- 
hedra with both internal monomer and over- 
all assembly structure nearly identical to the 
computational models, and of disk-shaped 
nanopores generated by densely filling the 
space between cyclic protein rings with differ- 
ent diameters. Both the icosahedra and the 
disk designs are distinct from any previously 
designed or naturally occurring structures; the 
former have smaller subunits, smaller radii, 
and lower porosities, and the latter have nar- 
row central pores within large, circular, other- 
wise nonporous structures. These structures 
could not have been built with previous bottom- 
up approaches. For the icosahedra, generating 
the shape complementarity of the interfaces 
requires the context of the full capsid struc- 
ture, possible only through a top-down ap- 
proach, and for the disks, densely filling a 
prescribed volume from preexisting building 
blocks is generally not possible. The density 
of protein chains and termini available for 
fusion to the icosahedra is considerably greater 
than the most compact previously designed 
assembly, enabling fusion to functional protein 
domains to generate bioactive nanoparticles. 
The Ang1 F domain-displaying capsids are
potent activators of angiogenesis, and the in- 
fluenza HA-displaying capsids elicit strong 
anti-HA antibody responses in mice. The capa- 
bility of the MCTS approach to optimize any set 
of specified geometric criteria in a top-down 
fashion provides a route to potent, multivalent 
cellular receptor agonists and vaccines that are 
custom designed to rigidly scaffold immunogen 
or receptor-binding monomers and precisely 
position them relative to one another. More 
generally, our results demonstrate the power 
of RL for protein design, which we expect can 
be increased further by the incorporation of 
policy and value networks (30-32, 54) to fur- 
ther guide the search. 
QUANTUM MECHANICS


Schrödinger cat states of a 16-microgram mechanical oscillator


Marius Bild, Matteo Fadel, Yu Yang, Uwe von Lüpke, Phillip Martin, Alessandro Bruno, Yiwen Chu


According to quantum mechanics, a physical system can be in any linear superposition of its possible 
states. Although this principle is routinely validated for microscopic systems, it is still
unclear why we do not observe macroscopic objects to be in superpositions of states that can be
distinguished by some classical property. Here we demonstrate the preparation of a mechanical
resonator in Schrödinger cat states of motion, where the ~$10^{17}$ constituent atoms are in a superposition
of two opposite-phase oscillations. We control the size and phase of the superpositions and investigate 
their decoherence dynamics. Our results offer the possibility of exploring the boundary between the 
quantum and classical worlds and may find applications in continuous-variable quantum information 
processing and metrology with mechanical resonators. 


Quantum mechanics is one of the most
successful scientific theories ever formulated.
However, from the early days
of quantum mechanics until now, it has
been unclear why quantum phenomena,
such as state superpositions, are never ob- 
served in the macroscopic world. In his 1935 
work (1), Erwin Schrödinger imagined a device
able to poison a cat as a consequence of a
radioactive decay, concluding that the super- 
position of an atom being “decayed” and “not 
decayed” could be mapped onto a superposi- 
tion of the cat being simultaneously “dead” 
and “alive.” There are two aspects of this hy- 
pothetical scenario that make it seem absurd 
and counterintuitive: First, a cat is a macro- 
scopic, everyday object; and second, “dead” and 
“alive” are states that are mutually exclusive 
within our classical experience. 

Many explanations have been proposed as 
to why we may never encounter a cat in such 
an unfortunate situation. Macroscopic objects 
may simply be too complex and subject to too 
many sources of decoherence to sustain a super- 
position of classically distinct states. Other 
theories introduce additional effects beyond 
standard quantum mechanics, such as wave 
function collapse due to intrinsic stochastic 
noise or gravitational decoherence (2). These 
effects are typically expected to scale with the 
mass of the system and the distinctness of the 
states that are superposed. Therefore, observ- 
ing state superpositions in massive objects is 
of key importance for exploring the validity 
range of quantum mechanics as we know it. 
Beyond its fundamental interest, preparing and 
detecting Schrödinger’s cat states is essential
for applications in quantum technologies. Main
examples include Heisenberg-limited parame- 
ter estimation protocols (3, 4) and error-protected 
quantum information processing (5, 6). 
There have been many experimental demonstrations
of Schrödinger cat states (which
we will call “cat states” from here on). These 
include superpositions of internal and mo- 
tional degrees of freedom in trapped ions (7, 8), 
phase-space superpositions of electromagnetic 
waves in both the optical (9, 10) and microwave
domains (11-13), Greenberger-Horne-Zeilinger
states (14, 15), current superpositions in superconducting
quantum interference devices (16),
and spatial superpositions of atomic clouds
(17) and large molecules (18). We experimentally
demonstrate the preparation of cat states
in the motional degree of freedom of a solid- 
state mechanical resonator. Given the variety 
of definitions found in previous works, we 
define a cat state of a harmonic oscillator as 


Fig. 1. Illustration of the ħBAR device and system evolution. (A) Schematics of the ħBAR
device. The HBAR chip (top) has a layer of piezoelectric aluminum nitride (orange) and supports
standing acoustic waves (pink). The transmon qubit on the lower chip has a circular antenna to
couple with the HBAR. The inset shows the superposition of two opposite-phase oscillations of
atoms in the crystal lattice. (B) Simulated evolution without decoherence of the qubit
$|e\rangle$ state population $P_{|e\rangle}$ and purity $\gamma$ under the JC interaction when
the qubit is initialized in $|{-Z}\rangle$ and the phonon in a coherent state. (C) Illustration
of the evolution of an initial phonon coherent state (red circle on the left) in phase space.
The blue (yellow) crescent shapes indicate the state $|\Phi_+\rangle$ ($|\Phi_-\rangle$), which
is the phonon state when the qubit is initialized in $|{+X}\rangle$ ($|{-X}\rangle$).
Interference fringes appear around time $t_c$ when the qubit is prepared in a superposition of
$|{+X}\rangle$ and $|{-X}\rangle$. Around the revival time $t_R$, the two phonon states again
overlap (purple).


a coherent superposition of two or more states
with well-separated phase-space distributions.


Experimental setup 


Our device, which we call a ħBAR, as in previous
works (19), consists of a high-overtone
bulk acoustic-wave resonator (HBAR) coupled 
to a superconducting transmon qubit. The 
transmon qubit allows us to create, control, 
and read out phonon states in the HBAR. 
Qubit and HBAR are fabricated on separate 
sapphire chips, which are subsequently flip 
chip bonded into the final device (Fig. 1A). A 
dome of piezoelectric aluminum nitride on the 
HBAR chip coherently couples the electric 
field of the qubit with the strain field of the 
resonator modes. The device is placed inside a 
three-dimensional Al cavity, which allows us 
to control the qubit with the standard circuit 
quantum electrodynamics toolbox (20). The 
acoustic free spectral range is ~12 MHz, and 
frequency tuning the qubit via a microwave 
drive-induced Stark shift allows us to address 
several longitudinal phononic modes. The 
acoustic lattice oscillations are localized within
a Gaussian mode with waist $w_0 = 27$ µm and
length $L = 435$ µm, giving a mode volume of
$\pi w_0^2 L = 0.001$ mm$^3$ [see supplementary text
(21) section B for details]. More details about
this circuit quantum acoustodynamics (cQAD) 
system (19, 22) and the device (23) can be 
found in previous works. 
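
The quoted mode volume can be checked with a few lines of arithmetic. The sketch below is only a unit-conversion check using the waist and length stated above; the function name is hypothetical and not part of the original work.

```python
import math

W0_UM = 27.0    # Gaussian mode waist w_0 from the text, in micrometers
L_UM = 435.0    # mode length L from the text, in micrometers

def mode_volume_mm3(w0_um, length_um):
    """Mode volume pi * w0^2 * L, converted from cubic micrometers to cubic millimeters."""
    volume_um3 = math.pi * w0_um**2 * length_um
    return volume_um3 * 1e-9  # 1 mm^3 = 1e9 um^3

print(f"mode volume = {mode_volume_mm3(W0_UM, L_UM):.2e} mm^3")
```

The result agrees with the rounded value of 0.001 mm$^3$ given in the text.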

In the classical picture, one can imagine a 
coherent state $|\alpha\rangle$ in the phonon mode as a
coherent displacement of the atomic lattice
with an amplitude proportional to $\alpha$. In the
quantum picture, an example of a cat state is
a quantum superposition of two coherent
states with opposite displacement amplitudes,
leading to the physical interpretation of such
a state as the superposition of two oscillations
of the atomic lattice with the same frequency
$\omega_p$ and relative phase $\pi$. Considering a snapshot
in time where both oscillations are at
their displacement maximum, Schrödinger’s
cat being in a superposition of dead and alive 
is analogous to a superposition of atoms in 
the HBAR being in two distinct positions in 
space, as illustrated in the inset of Fig. 1A. 
Here, we define the positions as distinct when 
their separation is larger than the fluctua- 
tions due to quantum, thermal, or other sources 
of noise. 


Generating cat states 


To realize a cat state in our system, we use the
Jaynes-Cummings (JC) interaction with the
qubit and phonon on resonance (11, 24). The
interaction Hamiltonian is


$H/\hbar = g_0\left(\sigma^+ a + \sigma^- a^\dagger\right)$ (1)


where $g_0$ is the coupling strength between
qubit and phonon mode, $\sigma^+$ is the raising
operator for the qubit, and $a^\dagger$ is the raising
operator for the phonon mode. This results in
Rabi oscillations between the states $|e, n-1\rangle$
and $|g, n\rangle$ at a rate $g_0\sqrt{n}$, where $|g\rangle$ ($|e\rangle$) is the
qubit ground (excited) state and $|n\rangle$ is the Fock
state of $n$ phonons. As a consequence of this
$\sqrt{n}$ scaling, if the phonon mode is prepared in
a coherent state with large enough amplitude
and the qubit is prepared in $|g\rangle$ or $|e\rangle$, their
coherent interaction rapidly dephases. Hence,
the oscillations of the qubit population “collapse”
(Fig. 1B) with a decaying amplitude
proportional to (25, 26) $\exp\left(-(t/t_{\mathrm{collapse}})^2\right)$,
where $t_{\mathrm{collapse}} = \sqrt{2}/g_0$ is the collapse time in
the limit of $\alpha \gg 1$. At this time, the qubit and
phonon states are entangled. This can be seen
in Fig. 1B as a minimum in the qubit state purity
$\gamma(t) = \mathrm{Tr}\left(\rho_q(t)^2\right)$ around $t_{\mathrm{collapse}}$, where
$\rho_q(t)$ is the reduced density matrix of the
qubit. Notably, owing to the quantized phonon
energy and the consequent discrete oscillation
frequency spectrum, the oscillations revive
in finite time (24). For $\alpha \gg 1$, this revival occurs
at $t_R = 2\pi\alpha/g_0$ (27). Between the collapse and
revival, at time $t_R/2$, the qubit and phonon
disentangle from each other. The state being
separable at $t_R/2$ coincides with the occurrence
of a superposition of two distinct states
in phase space, realizing a cat state in the phonon
mode (11, 21, 24, 28).
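
The collapse and revival time scales above can be illustrated numerically. The following is a minimal sketch, not the authors' simulation: it ignores decoherence, assumes the qubit starts in $|e\rangle$ with the phonon in a coherent state, and backs out $g_0$ from the relation $t_R = 2\pi\alpha/g_0$ using the values $\alpha = 1.75$ and $t_R \approx 6.7$ µs reported in the experimental section. All function names are hypothetical.

```python
import numpy as np
from math import factorial

ALPHA = 1.75        # initial coherent state size reported in the text
T_R_US = 6.7        # measured revival time reported in the text, in microseconds

# Assumption: g0 (in rad/us) is inferred from t_R = 2*pi*alpha/g0, not taken from the paper.
G0 = 2 * np.pi * ALPHA / T_R_US

def collapse_time(g0):
    """Collapse time sqrt(2)/g0, valid in the large-alpha limit."""
    return np.sqrt(2) / g0

def revival_time(alpha, g0):
    """Revival time 2*pi*alpha/g0, valid in the large-alpha limit."""
    return 2 * np.pi * alpha / g0

def excited_population(t, alpha, g0, n_max=40):
    """Qubit excited-state population under the resonant JC interaction.

    Each Fock component |e, n> exchanges one quantum with |g, n+1>, so its
    excited-state population is cos^2(g0 * sqrt(n + 1) * t). The coherent
    state weights these components with a Poisson distribution.
    """
    n = np.arange(n_max)
    fact = np.array([factorial(int(k)) for k in n], dtype=float)
    p_n = np.exp(-abs(alpha) ** 2) * abs(alpha) ** (2 * n) / fact
    t = np.atleast_1d(np.asarray(t, dtype=float))[:, None]
    return np.sum(p_n * np.cos(g0 * np.sqrt(n + 1) * t) ** 2, axis=1)

times = np.linspace(0.0, 8.0, 400)            # microseconds
population = excited_population(times, ALPHA, G0)
print(f"g0 = {G0:.3f} rad/us")
print(f"collapse time = {collapse_time(G0):.2f} us")
print(f"revival time  = {revival_time(ALPHA, G0):.2f} us")
```

With these inputs the computed collapse time is close to the roughly 0.9 µs quoted in the experimental results, which is a useful consistency check on the two formulas. Because $\alpha = 1.75$ is not deep in the large-$\alpha$ limit, the revival in `population` is expected to be only partial.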

A more intuitive explanation for the origin 
of the cat state comes from the time evolution 
of the reduced phonon state in phase space. 
As illustrated in Fig. 1C and shown in supplementary
text section A, if the qubit is initialized
in the state $|{\pm X}\rangle = (|e\rangle \pm |g\rangle)/\sqrt{2}$ and
in the limit of large $\alpha$, the evolution leads to a
rotation in phase space with an angular velocity
$\pm|g_0/(2\alpha)|$ and a distortion of the coherent
states. We call the resulting states $|\Phi_\pm(t)\rangle$, whose


full expressions are given in the supplemen- 


Bild et al., Science 380, 274-278 (2023) 21 April 2023 


Fig. 2. Collapse and revival dynamics. (A) Experimental sequence for observing collapse
and revival dynamics and for preparing cat states (details in the main text). (B) Measured
qubit population and state purity. The solid and dashed black lines are the simulation
results of the qubit population and purity, respectively. Three time points of particular
interest are highlighted (dashed lines): initial state time, cat state time ($t_c$), and
revival state time ($t_R$). (C) Measured Wigner function of the phonon state at the three
time points. Axes are the real and imaginary parts of the complex displacement amplitude
$\beta$ used during Wigner tomography (23). The black crosses indicate the positions of the
two coherent states composing the fitted CSS state (Eq. 2). (D) Corresponding simulated
Wigner functions.


tary text (eq. S15). If the qubit is initialized
in the state $|{+Z}\rangle = (|{+X}\rangle + |{-X}\rangle)/\sqrt{2}$, the
phonon state will evolve into $|\Phi_+(t)\rangle + |\Phi_-(t)\rangle$
as shown in Fig. 1C. At time $t_R/2$, the two state
components $|\Phi_\pm\rangle$ have covered a rotation angle
of $\pi/2$ around a circle of radius $\alpha$, maximizing
their separation in phase space and forming
a cat state (24). Finally, at the revival time
$t_R$, the two phonon state components $|\Phi_\pm\rangle$
have both rotated by a phase of $\pi$ and approximately
recombine in phase space.


Experimental results 


We experimentally confirm both the predicted 
collapse and revival of Rabi oscillations and 
the creation of mechanical cat states in the 
phonon mode. The basic sequence used in the 
experimental demonstration of the JC dynam- 
ics described above can be seen in Fig. 2A. We 
displace the phonon mode with a resonant 
drive of amplitude A to a coherent state with 
amplitude a. To mitigate any effect of the drive 
on the qubit state, we then cool the qubit with 
an ancillary phonon mode (22, 23). The qubit 
is subsequently prepared in its initial state by 
applying a drive pulse with variable phase and 
amplitude. To induce the resonant interaction, 
we tune the qubit to the phonon mode frequency 
for a variable interaction time $t$. Depending
on which of the subsystems we want to char- 


acterize, we choose a measurement sequence 
that implements the appropriate measure- 
ment operator. First, we simply measure the 
qubit excited state population. The resulting 
data are shown in Fig. 2B for A = 0.35. Here 
the value for A is a scaling factor for the am- 
plitude of a microwave drive, which we cali- 
brate to find a corresponding initial coherent 
state size of $\alpha = 1.75$ (supplementary text section
E). As expected, we observe oscillations
that collapse after a time $t_{\mathrm{collapse}} \approx 0.9$ µs
and revive at $t_R \approx 6.7$ µs. This revival indicates
the coherent exchange of energy quanta between
the qubit and phonon mode during the resonant
interaction. By performing full qubit tomography
after the resonant interaction, we can
also reconstruct the reduced density matrix
of the qubit subsystem $\rho_q$ and calculate the
purity of the qubit state $\gamma(t)$. We confirm a
local minimum of $\gamma(t)$ around the predicted
collapse time $t_{\mathrm{collapse}}$, followed by a local maximum
around $t_R/2$ (Fig. 2B).

We now focus on the time evolution of the 
phonon subsystem by performing full Wigner 
tomography of the phonon state after the 
resonant interaction times $t = 0$, 2.9, and 7.0 µs.
To this end, we use the parity measurement 
technique established in a previous work (23). 
To compensate for the effect of qubit dephasing 
during the parity measurement, we normalize 
Fig. 3. Cat state amplitude and phase control. (A) Cat states prepared with different
displacement pulse amplitudes A. Top row: Measured Wigner functions; bottom row: analytical
state $\rho(t_c)$ that best fits the data. The fitted CSS states, with coherent state
positions indicated by black crosses, have D = 1.09 (1.43) for A = 0.25 (0.30). (B) Cat
state sizes as a function of the displacement amplitude A, obtained
all measured parity values to that of the Fock
$|0\rangle$ phonon state (supplementary text section
D). The measured Wigner functions are shown 
in Fig. 2C, where axes in phase space are nor- 
malized by measuring the distribution of pop- 
ulations in the phonon Fock states for coherent 
states created with different drive amplitudes 
(22) (supplementary text section E). 

From the measured data, we confirm the 
evolution of the initial coherent state (Fig. 2C, 
left) into a cat state at $t_c = 2.9$ µs (Fig. 2C, center),
showing two state components clearly
distinct in phase space and interference fringes
located between them. We choose this value
of $t_c$ because it corresponds to the measured
maximum in the qubit state purity. It deviates
somewhat from the value predicted by
using the large-$\alpha$ limit, which is $t_R/2 = 3.3$ µs.
For the evolution time $t = 7.0$ µs, the predicted
refocusing into a crescent-shaped overlap be- 
tween the counter-rotating state components 


can be observed (Fig. 2C, right). 
To benchmark the cat state and obtain an
estimate of its size, we implement a maximum-likelihood
reconstruction (29) of the phonon
state $\rho_p$ from the measured state with $A = 0.35$
and $t_c = 2.9$ µs. We then fit the reconstructed
state to an analytical expression of the expected
phonon state $\rho(t_c)$ in the absence of decoherence
and after tracing out the qubit (supplementary
text section A). Fixing the interaction
time to $t_c$ from the experiment, the fit maximizes
the fidelity between $\rho_p$ and $\rho(t_c)$ by
varying the initial coherent state size $\alpha_{\mathrm{fit}}$ of
$\rho(t_c)$. The result yields a fidelity of $F = 76\%$
to an analytical state with initial coherent
state size $\alpha_{\mathrm{fit}} = 1.62$, which is smaller than
the initial displacement $\alpha = 1.75$ because the
expression for $\rho(t)$ does not include phonon
losses. We attribute the infidelity to a com- 
bination of decoherence and measurement 
imperfections that lead to additional artifacts 
in the Wigner function (23). To further con- 
firm that the phonon state behaves as ex- 


pected, Fig. 2D shows the results of a master 
equation simulation of the full experimental 
protocol with independently measured sys- 
tem parameters, showing good agreement 
with the measurements in Fig. 2C. 

The state we obtained resembles the two- 
component coherent state superpositions (CSS) 


$|C\rangle = \mathcal{N}\left(|\alpha_1\rangle + e^{i\phi}|\alpha_2\rangle\right)$ (2)


a type of cat state that is often invoked in
quantum information (5, 6) and parameter
estimation protocols (3, 4). Here $|\alpha_{1,2}\rangle$ are two
coherent states and $\mathcal{N}$ is the appropriate
normalization constant. We can fit our reconstructed
state to Eq. 2 by optimizing $\alpha_1$, $\alpha_2$,
and $\phi$ for maximum fidelity $F(\rho_p, |C\rangle\langle C|)$. Because
our state is not centered around the origin
in phase space, we use half the phase space distance
between the coherent state components
$D = |\alpha_1 - \alpha_2|/2$ as a measure of the cat state
size. This choice is motivated by considering a
coherent state superposition centered around
the origin in phase space, such that $\alpha_1 = -\alpha_2$.
Then $D = |\alpha_{1,2}| = \sqrt{\bar{n}}$, where $\bar{n}$ is the average
phonon population of the state created. For the
state in Fig. 2C, we obtain a state size $D = 1.61$,
corresponding to $\bar{n} = D^2 = 2.60$, with a fidelity
of $F = 66\%$. The smaller state size $D$ compared
to the initial coherent displacement is due to
a combination of decoherence and the choice
of interaction time $t_c < t_R/2$, resulting in the
two counter-rotating state components not
reaching their maximum separation in phase
space. The fidelity is lower compared to the
fitted analytical state $\rho(t_c)$, because $\rho(t_c)$ itself
has a finite infidelity to the CSS state $|C\rangle$.
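
To make the size measure concrete, the sketch below builds a two-component coherent state superposition of the form of Eq. 2 in a truncated Fock basis and evaluates $D$ and $D^2$. It is an illustrative construction under stated assumptions, not the fitting procedure used in the experiment: the truncation dimension, the function names, and the choice of a centered state with $\alpha_1 = -\alpha_2 = 1.61$ and $\phi = 0$ are all hypothetical.

```python
import numpy as np
from math import factorial

def coherent(alpha, dim=60):
    """Fock-basis amplitudes of the coherent state |alpha>, truncated to `dim` levels."""
    n = np.arange(dim)
    sqrt_fact = np.sqrt(np.array([factorial(int(k)) for k in n], dtype=float))
    return np.exp(-abs(alpha) ** 2 / 2) * complex(alpha) ** n / sqrt_fact

def css_state(alpha1, alpha2, phi, dim=60):
    """Normalized superposition N(|alpha1> + exp(i*phi)|alpha2>).

    Normalizing numerically takes care of the constant N, including the
    overlap between the two coherent states.
    """
    psi = coherent(alpha1, dim) + np.exp(1j * phi) * coherent(alpha2, dim)
    return psi / np.linalg.norm(psi)

def cat_size(alpha1, alpha2):
    """Half the phase-space distance between the two components, D = |alpha1 - alpha2|/2."""
    return abs(alpha1 - alpha2) / 2

def mean_phonon_number(psi):
    """Average phonon population of a Fock-basis state vector."""
    n = np.arange(len(psi))
    return float(np.sum(n * np.abs(psi) ** 2))

psi = css_state(1.61, -1.61, 0.0)
D = cat_size(1.61, -1.61)
print(f"D = {D:.2f}, D^2 = {D**2:.2f}")
print(f"mean phonon number of the superposition = {mean_phonon_number(psi):.2f}")
```

For a superposition of this modest size, the mean phonon number computed from the state vector can differ slightly from $D^2$ because the two components still overlap; the two quantities converge as the components separate.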

We can now translate the parameters of the 
measured cat state into physical properties of 
the phonon mode, such as the spatial separation
between atoms. A state size of $D = 1.61$
corresponds to a maximal delocalization of
$7.0 \cdot x_{\mathrm{zpf}}$, where $x_{\mathrm{zpf}}$ is the zero point motion
of an equivalent one-dimensional (1D) quantum
harmonic oscillator (supplementary text
section B). Because we are not considering a
center-of-mass mode, there is some freedom in
choosing $x_{\mathrm{zpf}}$, which is then associated with
an effective oscillating mass of the mode. If we
choose the root-mean-square (RMS) value of
the atomic displacements, we find an effective
mass of $m_{\mathrm{eff}}^{\mathrm{RMS}} = 16.2$ µg, corresponding
to ~$10^{17}$ atoms, delocalized over a distance of
$2.1 \times 10^{-18}$ m (supplementary text section B).
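
The order of magnitude of the atom count can be checked from the effective mass alone. The sketch below assumes the oscillating mass is pure sapphire (Al2O3), which is the chip material named in the experimental setup; the molar mass and the function name are inputs of this illustration rather than values from the paper.

```python
AVOGADRO = 6.02214076e23        # particles per mole (exact SI value)
MOLAR_MASS_AL2O3 = 101.96       # g/mol; assumption: the mode mass is pure sapphire (Al2O3)
ATOMS_PER_FORMULA_UNIT = 5      # 2 Al + 3 O

def atom_count(mass_micrograms):
    """Number of atoms in a given mass of Al2O3."""
    mass_grams = mass_micrograms * 1e-6
    formula_units = mass_grams / MOLAR_MASS_AL2O3 * AVOGADRO
    return formula_units * ATOMS_PER_FORMULA_UNIT

n_atoms = atom_count(16.2)      # effective mass from the text, in micrograms
print(f"approximately {n_atoms:.1e} atoms")
```

Under this assumption the count falls between $10^{17}$ and $10^{18}$, consistent with the order of magnitude quoted in the text.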

In applications such as bosonic encodings of 
a qubit state, full control over the phase and 
amplitude of the created cat state is required 
(5, 6, 30). In the following, we demonstrate this 
level of control in our experiment. By varying 
the amplitude A of the phonon displacement 
drive, we can control the amplitude of the initial 
coherent state and the size of the resulting cat 
state. For displacement amplitudes A = 0.25 
and 0.30, we create cat states with D = 1.09 
Fig. 4. Decoherence of cat states. (A) Measured 1D cuts
through the interference fringes
of the D = 1.43 cat state for
a range of wait times between
state creation and measurement.
(B) Extracted negativities
(squares) from each cut versus
wait times for three cat state
sizes, together with fitted exponential
decays (solid lines).
Both data and fitted curves are
normalized to the fitted value
at t = 0. (C) Characteristic
and 1.43, respectively (Fig. 3A). The fidelities
of reconstructions of the measured states to
both a CSS state and $\rho(t_c)$ are given in Fig.
3B. The best fit $\rho(t_c)$ is plotted in the lower
row of Fig. 3A, showing good qualitative agreement
with the data. As before, the finite fidelities
and lower-contrast fringes of the measured
states as compared to the best-fit $\rho(t_c)$ arise
mainly from decoherence of the state during 
measurement. 

In the two-component cat state encoding of 
a qubit, the six cardinal points of the Bloch 
sphere are given by two coherent states $|\alpha_1\rangle$
and $|\alpha_2\rangle$, along with their four superpositions
with $\pi/2$ difference in the phase $\phi$ (Eq. 2). We
can prepare similar states by initializing the
transmon qubit state in all six cardinal points
$|{\pm X}\rangle$, $|{\pm Y}\rangle$, $|{\pm Z}\rangle$ of its Bloch sphere before
performing the cat state generation protocol. The
preparation of $|{\pm X}\rangle$ and $|{\pm Y}\rangle$ is calibrated
using collapse and revival measurements as a
function of the qubit drive phase (supplementary
text section A). The results for $A = 0.35$
and $t_c = 2.10$ µs are shown in Fig. 3C. We observe
a distorted coherent state located on the
upper (lower) half of phase space for the qubit
initially in $|{\pm X}\rangle$. This separation in phase space
is expected from the opposite rotation directions
between the phonon states when the qubit is
prepared in $|{\pm X}\rangle$ (Fig. 1C). The $|{\pm Y}\rangle$, $|{\pm Z}\rangle$ states
then give rise to four cat states that differ in
phase by $\pi/2$, as can be observed in the phases
of the interference fringes in Fig. 3D. The initial
energy of the qubit, and thus of the total
system, is not the same for all six scenarios, 
resulting in slightly different sizes for the cat 
states. In the limit of large cat state size, this 
difference becomes negligible, and the pho- 
non subspace maps onto that of the cat state 
encoding. 

Superposition states are nonclassical states 
that are notoriously prone to decoherence. We 
now investigate the quantum-to-classical tran- 
sition of different-sized cat states by letting 
them evolve freely for a varying wait time t 
before performing Wigner tomography. Dur- 
ing this evolution, the qubit is far detuned 


from the phonon mode. In particular, we focus
on a slice through the Wigner function’s
interference fringes at $\mathrm{Im}(\beta) = 0$, which highlights
the nonclassical features of the superposition.
Figure 4A shows the time evolution
of this slice for the $D = 1.43$ cat state. We observe
that the negative features disappear on a
time scale much faster than $T_1 = 84$ µs, the
energy relaxation time of the phonon mode.

As a measure for the nonclassicality of the 
state, we extract the time-dependent negativity
(31), defined as $\delta(t) = \int \left(|W(\beta, t)| - W(\beta, t)\right) d\beta$.
Here, $W(\beta, t)$ is the measured Wigner function
of the cat state at time $t$, and the integration is
over the 1D slice in phase space parameterized
by the complex displacement amplitude
$\beta$. Figure 4B shows the resulting $\delta(t)$ for the
three different cat state sizes of Fig. 3B. We
fit each dataset to an exponential decay plus
a constant offset. The offset in the measured
Wigner values arises from the fact that our
Wigner tomography is not performed in the
ideal dispersive limit (23). The extracted decay
time scales are plotted in Fig. 4C. We show
in section H of the supplementary text that, in
the limit of large $|\alpha|$, $\delta(t)$ decays exponentially
with a time constant $\tau_{\mathrm{cat}} = T_1/(2|\alpha|^2)$. However,
for small $|\alpha|$, $\tau_{\mathrm{cat}}$ deviates from this expression
and is in fact dependent on properties
of the exact state, such as the phase of the superposition.
The data in Fig. 4C show the expected
qualitative behavior of faster-decaying
negativity for larger-sized cat states, and we
present a more detailed quantitative analysis
in the supplementary text (21).
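
The negativity integral and the large-$|\alpha|$ decay constant can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the integral is approximated with a simple trapezoidal rule over a sampled 1D slice, and using the fitted cat sizes $D$ in place of $|\alpha|$ is an assumption made only to show the scaling. The text notes that the formula is not expected to hold quantitatively for cat states this small.

```python
import numpy as np

def negativity(beta, wigner):
    """Trapezoidal estimate of the integral of (|W| - W) over a sampled 1D slice.

    `beta` holds the sample positions along the slice and `wigner` the
    corresponding Wigner function values. Regions where W >= 0 contribute
    zero; regions where W < 0 contribute 2|W|.
    """
    beta = np.asarray(beta, dtype=float)
    integrand = np.abs(wigner) - np.asarray(wigner, dtype=float)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(beta)))

def tau_cat_large_alpha(t1_us, size):
    """Large-|alpha| estimate of the negativity decay time, T1 / (2 |alpha|^2)."""
    return t1_us / (2 * abs(size) ** 2)

T1_US = 84.0                               # phonon energy relaxation time from the text
for size in (1.09, 1.43, 1.61):            # cat state sizes D reported in the text
    print(f"D = {size:.2f}: tau_cat ~ {tau_cat_large_alpha(T1_US, size):.1f} us")
```

The loop reproduces only the qualitative trend discussed above: larger cat states are predicted to lose their negativity faster, and all three estimates are well below the 84 µs energy relaxation time.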


Concluding remarks 


Our results show the generation of cat states 
in a microgram-mass solid-state mechanical 
mode using the tools of cQAD and pave the
way toward using such systems for tests of wave 
function collapse models (32). These tests would 
benefit from larger-sized cat states, resonators 
with higher masses (33), and longer phonon 
lifetimes. To facilitate comparison with other 
mechanical resonators and theoretical models 
that often consider center-of-mass motion, we 
note that the HBAR mode is a standing wave 
in which a half-wavelength section approxi- 


mates a center-of-mass mode where all atoms 
oscillate in the same direction. Such a section 
has a mass on the order of 30 ng, obtained by 
dividing the total effective mass of 16 µg by
the longitudinal mode number of ~500. 

The maximum size of the cat state that we 
can prepare is currently limited by our device 
parameters, including both the qubit and pho- 
non decoherence rates. The latter is especially 
important given that, in general, the decoher- 
ence rate of the cat state is proportional to the 
square of the cat state size D. Furthermore, 
additional improvements to the properties of 
qubit and phonon resonator would enable al- 
ternative cat state generation protocols that 
can in principle lead to states with a higher 
fidelity to, for example, a CSS state (12, 13). We 
point out, however, that although CSS states 
represent a useful benchmark (because they 
have been extensively studied for applications 
such as quantum information and quantum 
metrology), many of their salient features 
are present already in the states that we have 
demonstrated. These include the phase-space 
separation of state components, which is im- 
portant for error protection of encoded qubits 
(30, 34, 35), and the presence of interference 
fringes with high Fisher information, which is 
useful for quantum-enhanced sensing (4, 36, 37). 
Scaling information pathways in optical fibers by topological confinement


Zelin Ma, Poul Kristensen, Siddharth Ramachandran


Spatial mode-count scalability in optical fibers is of paramount importance for addressing the upcoming
information-capacity crunch, reducing energy consumption per bit, and for enabling advanced quantum
computing networks, but this scalability is severely limited by perturbative mode mixing. We show
an alternative means of light guidance, in which light’s orbital angular momentum creates a centrifugal
barrier for itself, thereby enabling low-loss transmission of light in a conventionally forbidden regime
wherein the mode mixing can be naturally curtailed. This enables kilometer-length-scale transmission
of a record ~50 low-loss modes with cross-talk as low as −45 decibels/kilometer and mode areas
of ~800 square micrometers over a 130-nanometer telecommunications spectral window. This distinctive
light-guidance regime promises to substantially increase the information content per photon for
quantum or classical networks.


Light transport with optical fibers in multiple
orthogonal dimensions of a photon is of great
utility for increasing the information capacity
of classical communications networks, reducing energy
consumption per bit, as well as for facilitating
development of advanced quantum computing
networks in high-dimensional Hilbert spaces
(1, 2). With degrees of freedom such as wavelength,
amplitude, phase, and polarization
having been exhausted, spatial multiplexing 
remains the last available dimension to be 
considered (3). Two primary approaches for 
exploiting space comprise the use of multi- 
mode fibers (MMFs), whose spatial modes 
are mutually orthogonal, and multicore fibers 
(MCFs), which consist of multiple single-mode 
waveguide cores. MCFs are appealing, given 
their deployment simplicity and backward 
compatibility, but their individual cores must 
be sufficiently spatially separated to avoid 
cross-talk, a requirement that limits channel- 
count scalability because reliability consid- 
erations place an upper limit on overall fiber 
dimensions (3). MMFs, by contrast, feature sev- 
eral benefits, such as high spatial efficiency— 
because all spatial modes occupy the same 
waveguiding core—and cost and efficiency 
savings through shared-pump optical ampli- 
fication (4). However, modes in conventional 
MMFs stochastically couple with each other 
because of ever-present ambient perturbations, 
a problem that can be addressed through the 
use of multi-input-multi-output digital signal 
processing (MIMO-DSP) in classical commu- 
nications systems (3), wherein mode mixing 
may actually be preferred. But this necessarily 
increases the complexity and power consump- 
tion of the receiver, and, crucially, this solution 
cannot be applied to single photons in quantum
networks. An ideal solution would be an
MMF whose modes behave like the sepa- 
rated cores of an MCF, because it would 
deliver the benefits of both approaches. How- 
ever, MMFs designed to limit mode mixing 
over kilometer-length scales have achieved only 
up to 12 modes (5), and even so, with cross-talk 
considerably worse than those of MCFs. Fur- 
ther scalability has eluded the scientific com- 
munity for almost a decade. 

Here, we demonstrate a mechanism to trans- 
mit light over long lengths of optical fibers 
by exploiting a topological effect wherein the 
photon remains confined to a fiber as a result 
of a centrifugal barrier that light’s orbital an- 
gular momentum (OAM) creates for itself. This 
form of light guidance is forbidden by the prin- 
ciples of total internal reflection (TIR), the pri- 
mary means for light transport demonstrated 
by Colladon, Babinet, and Tyndall more than 
150 years ago. Not restricted by the TIR con- 
dition, this regime of light transport results in 
propagating eigenstates that are naturally im- 
mune to perturbative mode mixing. We validate 
the practical utility of these modes, hereafter 
called topologically confined modes (TCMs), 
by demonstrating light transmission over a 
record number of low-loss spatial modes with 
negligible mode coupling over a kilometer of 
fiber, featuring mode areas that are an order of 
magnitude greater than that available from 
single-mode fibers (SMFs) or MCFs, even when 
they are tightly bent. We show that these bene- 
ficial metrics can be obtained over multiple spec- 
tral bands in use for telecommunications today. 


Concept and demonstration 
of topological confinement 


A heuristic ray picture of the concept of topo- 
logical confinement is shown in Fig. 1A (6). It 
schematically depicts a step-index fiber comprising
a high-index core, n_co, surrounded by a
low-index cladding, n_cl, and three exemplary
eigenmodes with similar effective indices, n_eff <
n_cl. These modes, by definition, do not satisfy
the TIR condition, and hence are normally expected
to have high confinement loss because
they radiate away from the waveguide; hence 
they acquired the common nomenclature of 
“cutoff” or “leaky” modes (7). However, these 
properties have a strong dependence on mode 
symmetry. Fiber eigenmodes are quantized by 
two transverse indices: L, denoting the azimuthal
index resulting in the mode carrying
OAM, and a radial index, m (manifesting in
oscillatory field amplitude in the radial direction).
Including polarization [left- or right-handed
circular polarization (LCP or RCP)]
orthogonality, a given |L| comprises four
modes: a pair of spin-orbit aligned (SOa)
modes (+|L|, LCP and −|L|, RCP) with n_eff
slightly different from a pair of spin-orbit
antialigned (SOaa) modes (−|L|, LCP and +|L|,
RCP) (8). As Fig. 1A shows, the transverse
momentum k_T is progressively more oriented
in the azimuthal rather than the radial direction
as |L| increases. Because the “escape” rate
of a mode tends to increase as k_T is progressively
more radially oriented, it follows that
the confinement loss of unbound, radiative 
modes decreases as the mode’s azimuthal in- 
dex increases. 

A more rigorous illustration of topological 
confinement is obtained by considering the 
waveguide eigenvalue equation (9) 


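$$\frac{d^2F(r)}{dr^2} + \frac{1}{r}\frac{dF(r)}{dr} + k_0^2\left[n^2(r) - \frac{L^2}{k_0^2 r^2}\right]F(r) = \beta^2 F(r) \qquad (1)$$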


where r is the radial coordinate, F(r) is the
radial amplitude of the electric field, n(r) is
the refractive index profile of the fiber, k_0 =
2π/λ is the free-space wave vector of light of
wavelength λ, and β (= k_0·n_eff) is the propagation
constant of an eigenmode. The attractive
potential that enables TIR-induced bound
modes is n^2(r), but in the presence of OAM, a
modified effective potential may be defined as


$$n_{\mathrm{centrifugal}}^2(r) = n^2(r) - \frac{L^2}{k_0^2 r^2} \qquad (2)$$


Figure 1B is a plot of this topologically modified
profile, n_centrifugal(r), experienced by cutoff
modes with three nonzero L's, calculated
at different wavelengths such that they have
identical n_eff < n_cl. As is evident, cutoff modes
encounter refractive index barriers (index 
trenches) of progressively greater magnitudes 
as |L| increases. Hence, although the mode is 
no longer strictly bound by TIR (so that it no 
longer experiences zero confinement loss), 
its confinement loss strongly depends on |L| 
because of the centrifugal barrier that light’s 
OAM creates for itself. 
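
As a rough illustration of the modified effective potential (Eq. 2), the following Python sketch evaluates n_centrifugal^2(r) = n^2(r) − L^2/(k_0^2 r^2) for an idealized step-index profile. The function name, the cladding index of 1.444, and the 30-μm core radius are assumptions chosen for illustration (only the 0.04 index contrast and the 1550-nm wavelength echo values quoted in the text); this is not the simulation model used by the authors.

```python
import math

def centrifugal_index_sq(r_um, L, wavelength_um=1.55,
                         n_co=1.484, n_cl=1.444, core_radius_um=30.0):
    """Return n^2(r) - L^2 / (k0^2 r^2) for an ideal step-index profile.

    All default parameter values are illustrative assumptions.
    """
    if r_um <= 0:
        raise ValueError("r_um must be positive")
    k0 = 2.0 * math.pi / wavelength_um            # free-space wave vector (rad/um)
    n = n_co if r_um <= core_radius_um else n_cl  # step-index refractive profile
    return n ** 2 - (L / (k0 * r_um)) ** 2

# Modified profile evaluated just outside the core boundary for three |L| values:
# the centrifugal term lowers the effective index more strongly as |L| grows.
r_edge = 30.0 + 1e-6
barrier = {L: centrifugal_index_sq(r_edge, L) for L in (10, 30, 50)}
for L, value in barrier.items():
    print(f"|L| = {L:2d}: n_centrifugal^2 just outside the core = {value:.4f}")
```

The values decrease monotonically with |L|, which is the index-trench behavior described for Fig. 1B; the sketch says nothing about the resulting confinement loss, which requires a full mode solver.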

Figure 1C shows the simulated confinement 
loss versus |L| for modes with radial orders 
Fig. 1. The principle of light transport with TCMs. (A) The propagation of
three cutoff OAM modes with different L's but identical n_eff in an exemplary step-index
fiber. Select OAM modes in this fiber are illustrated as spiral patterns for visual
clarity, with the number of spiral arms equal to |L|. Confinement loss of these
modes decreases as |L| increases because their transverse wave vectors, k_T,
become progressively more azimuthally oriented. (B) Topologically modulated
refractive index profile for the three cutoff OAM modes shown in (A). Higher
|L| induces a deeper index trench, creating a centrifugal barrier that prevents the


m ∈ [1,4] of a step-index fiber with n_co − n_cl =
0.04 and core diameter of 60 μm, using a
standard perfectly-matched-layer model (10)
(data points are simulated at different λ's such
that n_eff/n_cl is identical for all modes). Modes
of low |L|, and modes of high m regardless
of |L|, are highly lossy, justifying their long-held
nomenclature as “cutoff” or “leaky” modes,
but for first radial order (m = 1) modes of sufficiently
high |L|, confinement loss decreases
substantially, down to as low as ~10° dB/km. 
Hence, although TCMs with high |L| and m = 1 
do not satisfy TIR, their confinement loss sug- 
gests that their behavior mirrors that of con- 
ventional TIR bound modes. 

Figure 1D shows the simulated confinement 
loss versus relative topological charge (L − L_c,
where L_c is the topological charge of the last
bound mode, guided conventionally by TIR,
before cutoff) at 1550 nm in five different step-index
fibers of identical index contrasts (as in
Fig. 1C), but with core diameters ranging from
15 to 75 μm. Also shown is the overall measured
loss (~0.2 dB/km) for conventional transmission
fibers (e.g., SMF-28). The number of modes
that have confinement loss substantially lower
than SMF-28 loss increases with L − L_c. This
yields a crucial design criterion for realizing
fibers that scale mode count with TCMs. Any
fiber with a discrete index step at its outer core
boundary (required for the existence of a centrifugal
barrier; Fig. 1B) that guides, through
conventional TIR, sufficiently high-|L| modes
(high L_c), has a large ensemble of even higher-|L|
cutoff modes that also experience negligible
confinement loss, and the number of available 
TCMs simply increases with TIR bound-mode 
count. This rule implies that TCMs exist in any 
highly multimoded waveguide, comprising either 
high index contrasts, large core sizes, or both. 

We tested this concept using a ring-core fiber 
with a ring outer diameter of 56.6 μm because
it is functionally similar to a step-index fiber
(fig. S1) (11-13). Figure 2A shows the cutback
loss measurements on all SOa and SOaa modes
with m = 1, |L| ∈ [25,43]. Conventional TIR
bound modes have an average loss of ~1.4 dB/km, 
whereas TCM losses are ~5.1 dB/km (ignoring 
high-loss outliers that are not included in mode 
counts discussed in the next section). As is evi- 
dent, TCM losses are orders of magnitude lower 
than conventional wisdom posits for cutoff 
modes, validating the idea that centrifugal 
barriers greatly aid light transmission for 


cutoff modes from leaking out. (C) Simulated confinement loss versus mode
orders L and m in a step-index fiber with 60-μm ring diameter. The n_eff's of all the
modes are held 1% below n_cl. Insets are simulated modal images of m = 1
and m = 4 modes. (D) Simulated confinement loss at 1550 nm versus relative
OAM order L − L_c in five step-index fibers with the same index contrast but
different core sizes. L_c is the OAM order of the last TIR bound mode (L_c = 7, 16,
25, 35, and 45 for fiber core sizes of 15, 30, 45, 60, and 75 μm, respectively).
Dashed line represents the overall measured loss (0.2 dB/km) of SMF-28.


modes that are supposed to be radiated away. 
Barring a few transition TCMs that interact 
with parasitic high-m modes (fig. S4), the mea- 
sured loss scales adiabatically, and only slight- 
ly, with mode order |L|. Moreover, we find that 
TCM losses decrease considerably with increas- 
ing fiber-draw tension (fig. S3). These findings 
suggest that interfacial scattering loss at the 
core-cladding boundary plays a dominant role in 
the loss measured in the current fiber (14, 15), 
and that further glass viscosity and drawing- 
parameter optimizations promise substantially 
lower losses for TCMs. Given that simulated 
confinement losses for the TCMs (Fig. 2A, black 
dashed curve) are as low as ~10~° dB/km, and 
that low-loss telecommunications-grade fi- 
bers with similar index contrasts and index 
gradients are regularly available (16), we expect 
manufacturing optimizations to substantially 
reduce TCM losses down to those of commer- 
cial high-index-contrast SMFs. 

This topological effect is analogous to the 
OAM-dependent, above-threshold ionization 
of electrons in atomic orbitals (17) and to the 
metastability of Feshbach molecules in high 
rotational states (18). The effect is also well 
known in the context of nuclear reactions (19). 
Fig. 2. Guidance beyond cutoff and natural distortion immunity of
TCMs. (A) Experimental and simulated confinement loss at 1550 nm.
(B) (Left) n_eff versus λ for select
modes. Solid colored lines are
desired m = 1 modes of different |L|;
dashed black lines represent
undesired high-m modes. The solid
black line shows the index of the
silica cladding (conventional bound-
For light, centrifugal barriers increase the lifetimes
of photons in whispering-gallery cavities
(20, 21). 


Natural immunity to mode mixing 


The key benefit of this regime of light transport 
is not just in the availability of more eigenstates— 
that result could potentially have been achieved 
with larger waveguides or higher-index cores 
for conventional TIR bound modes; rather, it 
is that topological confinement radically changes 
the density of states of low-loss propagating 
modes in multimode fibers, with implications 
for modal fidelity, as demonstrated below. 

Figure 2B shows the simulated n_eff versus λ
of select modes in the same ring-core fiber.
Solid colored lines represent m = 1 modes of
various |L|, and the dashed black lines represent
high-m modes. It is immediately apparent
that in a highly multimoded fiber, the different 
dispersion relationships for modes of different 
radial and azimuthal orders result in a plethora 
of degeneracies, making them especially sus- 
ceptible to mode mixing. This is evident from 
a transmission experiment (fig. S2A) over a 
480-m-long fiber, in which an attempt at send- 
ing a signal in a conventional TIR bound mode 
(L = 33, m = 1) results in a completely distorted 
output mode image (because it is uncontrollably 
coupled with the degenerate L = 16, m = 4 mode). 
Such modal degeneracies are the primary rea- 
son behind the current lack of progress in 
scaling cross-talk-minimized mode counts using 
conventional TIR bound modes. 

By contrast, markedly distinct behavior is 
evident for TCMs (whose n_eff lie below the
silica cladding index, depicted as a solid black
curve in Fig. 2B). Although the L = 39, m = 1
mode is degenerate with an L = 24, m = 4 mode, 
the output image still shows a clear single ring. 
Modal degeneracy does not cause modal distor- 
Fig. 3. Modal characterization of the TCM-supporting ring-core fiber at 1550 nm. (A) Parasitic mode
power spectrum for L = 40, RCP launched mode measured by spatial interferometry. (Inset) Launched
mode image at fiber output. (B) Time-of-flight trace of L = 40, RCP launched mode. (C) Parasitic power for
inter-|L| and intra-|L| coupling. (D) Measured and simulated A_eff at 1550 nm versus |L|.


tions in this case because the two modes have 
orders-of-magnitude-different confinement losses 
(Fig. 1C), leading to frustration of phase-matched 
coupling (22) (Fig. 2C). As such, all TCMs (|L| ≥
34, m = 1) show clean measured modal output
images even though some transition TCMs re- 
veal extra total loss (fig. S4). Hence, operation 
in the TCM regime allows scaling mode count 
while decreasing the modal density of states 
that cause unwanted perturbative mixing. 


Fiber properties and transmission characteristics 
We quantified the cross-talk of desired m = 1 
modes at 1550 nm over this 480-m-long fiber 
using two techniques: (i) spatial interferom- 
etry (23), which yields power coupled in all 
modes for a given launched mode, and (ii) 
impulse-response measurements (5), which 
help disambiguate fundamental cross-talk due 
to in-fiber coupling from “technical,” discrete 
cross-talk from multiplexers. The latter is of 
Fig. 4. Information capacity potential of TCM-supporting ring-core
fibers. (A) Transfer matrices of all the modes with RCP. The red markers
along diagonal indicate good modes; deep-blue markers indicate unstable,
mixed modes. The box enclosed by the white dashed line denotes transition
between TIR bound modes and TCMs. (B) PER of all available modes with LCP
and RCP. (C) Cross-talk versus λ of modes with +|L|, RCP in S, C, and L


minimal concern in our analysis because it 
could be substantially minimized by using 
compact, fiber-compatible integrated mode 
transformers (24) in practical applications. 
Figure 3A shows an exemplary spatial interferometry
measurement when the L = +40,
RCP mode is launched. Parasitic power at the
output primarily resides in two modes (supplementary
text): in the L = −40, RCP mode at
the −43-dB level, and in modes separated from
the launched mode by ΔL = ±1, at the −20-dB
level. The former (intra-|L| mode coupling) arises
from in-fiber coupling between nearest neighbors
(in n_eff), whereas the latter (inter-|L| mode
coupling) arises from experimental mode-launch
imperfections, as well as in-fiber, bend-induced
coupling. A representative impulse-response measurement
for the same launched mode reveals
two distinct features of inter-|L| coupling: discrete
coupling resulting in a spike in temporal
trace that is due to mode launching, and distributed
coupling between this spike from the
L = 39 mode and the main peak for the L = 40
mode. The integrated power of this distributed
coupling is found to be lower than the noise
floor of the measurement (−46 dB). Figure 3C
is concerned only with coupling of the funda- 
mental (in-fiber, distributed) kind and plots the 
parasitic power of these two main sources of 
mode coupling for launched modes of interest in 
this fiber (supplementary text). The first finding 
of interest is that intra-|L| coupling dominates
over bend-induced inter-|L| coupling, justifying
using the intra-|L| coupling to represent
the cross-talk (supplementary text). The second,
more notable finding is that TCMs are more 
resistant to mode mixing, featuring cross-talk 
< −40 dB/km for almost all TCMs, barring a few
outliers primarily at the transition between
bound and cutoff modes. Some TCMs reach
cross-talk as low as −45 dB/km. By contrast, the
cross-talk for conventional TIR bound modes
remains ~−20 dB/km, the same as that found
in previous investigations of MMFs (5). Figure 3D
shows that the effective areas (A_eff) of
these modes, measured from recorded intensity
profiles (12), are an order of magnitude
larger than those for SMFs or MCFs. Further- 
more, we find that all these attractive attributes 
of TCMs are maintained even when the fiber is 
bent down to radii as small as 6 mm (fig. S5). 
To characterize this fiber over kilometer 
lengths of practical utility in data center ap- 
plications, we used a Sagnac reflector to equiv- 
alently double the length of our 480-m-long 
fiber (25) (fig. S2B). Figure 4A shows the full 
transfer matrix for all the modes |L| = 25 - 42, 
RCP (identical performance for LCP modes is
shown in fig. S6). Except for the outlier |L| =
28, 30, 33 modes (deep-blue square markers 
along the diagonal indicate TIR bound modes 


that experience mode mixing), and not count- 




telecommunications bands. (D) Cross-talk versus λ for TIR bound modes and
TCMs. The color curves are the average cross-talk; error bars indicate the
maximum and minimum values. (E) Estimated total mode count versus λ for
TCMs and the sum of TCMs and TIR bound modes. (F) Average loss versus λ of
good bound modes, TCMs, and their sum. The dashed curve represents the
confinement loss of TCMs.


ing the outliers in the TCM regime that exhibit 
high loss (Fig. 2A), we obtained 50 modes 
(red square markers) that exhibit cross-talk 
of ~−40 dB/km for TCMs and ~−20 dB/km for 
conventional TIR bound modes (based on 
measured power in the antidiagonals). The 
majority of the power in the two off-diagonals 
adjacent to the launched mode are from mode- 
launch imperfections, whereas fundamental 
in-fiber, bend-induced cross-talk in these modes 
is immeasurably low. We also measured the 
polarization extinction ratios (PER) (Fig. 4B) 
(12), which increased with |L| from ~10 to 
~15 dB because of the fundamental property 
of OAM conservation (26). Figure 4C shows 
dominant (intra-|L|) in-fiber cross-talk for
measurements repeated for λ ∈ [1460, 1590]
nm, covering the telecommunications S and 
C band and part of the L band (as before, 
outlier modes with excessive mode mixing or 
loss were excluded). Red and blue curves of 
various shades denote TCM and TIR bound- 
mode cross-talk, respectively. The cross-talk of 
modes depicted by other colors are typically 
for TIR bound modes that are close to cutoff, 
and hence modes that resemble TCMs as wave- 
length increases. Again, the average cross-talk 
of all TCMs lies between −40 and −45 dB/km,
whereas most bound modes appear pinned at
a cross-talk of −20 dB/km, regardless of wavelength.
Especially apparent is the cross-talk
of TIR bound modes close to cutoff, demonstrating
that, as the guidance mechanism for
light evolves from TIR to topological confine- 
ment, the cross-talk also improves by two or- 
ders of magnitude. In addition, Fig. 4, D to F 
depicts the cross-talk, mode count, and average 
loss, respectively, demonstrating a 1-km-long 
MMF with record-high mode counts of ~50, 
comprising ~20 conventional TIR bound modes 
with < −20 dB/km cross-talk and ~30 TCMs with
cross-talk < −40 dB/km across three telecommunications
spectral bands of interest. The
measured cross-talk and mode counts are on 
par with those of MCFs (27), and the broad 
bandwidth promises compatibility with
wavelength-division multiplexing. The 10×
larger A_eff and chromatic dispersion (fig. S7C),
as well as the relatively large differential modal 
delays (fig. S7, A and B), suggest that transmission 
in TCMs may be helpful in managing nonlinear 
signal distortions (28, 29) in telecommunications 
applications. In addition, a plethora of low-cross- 
talk modes potentially enables low cross-talk 
per superpositions of modes as well (30), as 
required for quantum transport, effectively al- 
lowing high-dimensional encoding (2) and 
sources through intermodal interaction (31). 
Average losses for TCMs (~5.0 dB/km) and for 
TIR bound modes (~1.4 dB/km) are, at their 
current maturity, reasonable for amplifier and 
inter-data center applications, but because 
confinement losses are predicted to be much 
lower, manufacturing optimizations (fig. S3) 
would make these fibers suitable for substan- 
tially longer telecommunications links. 
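
To give a sense of scale for the decibel figures quoted above, the short Python sketch below converts per-kilometer losses and cross-talk levels into linear power fractions. The helper names are hypothetical, and the inputs are simply the approximate values reported in the text (~5.0 and ~1.4 dB/km loss; −40 and −20 dB/km cross-talk); it is an arithmetic aid, not part of the measurement procedure.

```python
def db_to_fraction(db):
    """Convert a power ratio in decibels to a linear fraction."""
    return 10.0 ** (db / 10.0)

def transmitted_fraction(loss_db_per_km, length_km):
    """Fraction of launched power remaining after length_km of fiber."""
    return db_to_fraction(-loss_db_per_km * length_km)

# Approximate average losses quoted in the text, over 1 km of fiber
tcm_power = transmitted_fraction(5.0, 1.0)    # TCMs
bound_power = transmitted_fraction(1.4, 1.0)  # TIR bound modes

# Per-kilometer cross-talk levels expressed as linear power fractions
tcm_xt = db_to_fraction(-40.0)
bound_xt = db_to_fraction(-20.0)

print(f"TCM power after 1 km:        {tcm_power:.3f}")
print(f"Bound-mode power after 1 km: {bound_power:.3f}")
print(f"Cross-talk ratio bound/TCM:  {bound_xt / tcm_xt:.0f}x")
```

Under these inputs, roughly a third of the power launched into a TCM survives 1 km, and the 20-dB gap in cross-talk corresponds to a factor of 100 in coupled power, consistent with the "two orders of magnitude" stated in the text.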


Summary and conclusions 


The physics of frustrated coupling in TCMs 
yields an improvement of over two orders of 
magnitude in mode-coupling resistance over 
any MMF demonstrated to date. This fun- 
damental feature has enabled cross-talk and 
channel counts that are on par with MCFs, 
while providing additional crucial benefits, 
such as order-of-magnitude-greater mode areas 
and extreme bend tolerance. By contrast, TCMs 
are propagating modes of an MMF, and as 
such, share other favorable attributes includ- 
ing high spatial efficiency and the prospect of 
efficient, gain-equalized optical amplification 
of all channels. Thus, topological confinement 
represents a regime of light transport that 
combines the benefits of both MCFs and con- 
ventional MMFs, yielding a platform that is 
not only suitable for scaling the capacity of 
classical telecommunications and quantum net- 
works but also serves as a distinctive fiber-laser 
and nonlinear signal-processing medium on 
account of their large chromatic dispersions 
and record-large mode areas. 
CANCER EVOLUTION

The evolution of two transmissible cancers in Tasmanian devils

Kerstin Howe, Michael R. Stratton, Zemin Ning, Elizabeth P. Murchison*


Tasmanian devils have spawned two transmissible cancer lineages, named devil facial tumor 1 (DFT1) and 
devil facial tumor 2 (DFT2). We investigated the genetic diversity and evolution of these clones by analyzing 
78 DFT1 and 41 DFT2 genomes relative to a newly assembled, chromosome-level reference. Time-resolved 
phylogenetic trees reveal that DFT1 first emerged in 1986 (1982 to 1989) and DFT2 in 2011 (2009 to 2012). 
Subclone analysis documents transmission of heterogeneous cell populations. DFT2 has faster mutation rates 
than DFT1 across all variant classes, including substitutions, indels, rearrangements, transposable element 
insertions, and copy number alterations, and we identify a hypermutated DFT1 lineage with defective DNA 
mismatch repair. Several loci show plausible evidence of positive selection in DFT1 or DFT2, including loss of 
chromosome Y and inactivation of MGA, but none are common to both cancers. This study reveals the parallel 
long-term evolution of two transmissible cancers inhabiting a common niche in Tasmanian devils. 


Transmissible cancers are contagious
somatic cell lineages that spread through
populations by the physical transfer of
living cancer cells. Although few such
diseases are known in nature, Tasmanian
devils (Sarcophilus harrisii), marsupial carnivores
endemic to the Australian island of
Tasmania, host at least two transmissible
cancer clones. These cancers, known as devil
facial tumor 1 (DFT1) and devil facial tumor 2
(DFT2), both primarily cause malignant facial
and oral tumors that are spread by biting (Fig.
1A) (1-3). DFT1 was first observed in 1996 in
northeastern Tasmania and has subsequently 
spread widely (4, 5); DFT2, however, was dis- 
covered in 2014 on the D’Entrecasteaux Channel 
Peninsula in Tasmania’s southeast and is be- 
lieved to remain confined to this area (3, 6, 7). 
Both DFT1 and DFT2 are usually fatal, and rapid 
Tasmanian devil population declines associated 
with DFT1 have led to concern for conservation 
of the species (4, 5, 8). 
The emergence of two transmissible cancers 
in Tasmanian devils suggests that the species is 
particularly susceptible to this type of disease. 
Indeed, DFT1 and DFT2 appear to be indepen- 
dent occurrences of the same pathological pro- 
cess, and their comparison may illuminate the 
constraints of the biological niche that they in- 
habit. DFT1 and DFT2 are both undifferentiated 
Schwann cell cancers with similar dependence 
on receptor tyrosine kinase signaling (9-12). 
DFT1 first arose from the cells of a female 
founder devil and equally affects male and 
female devil hosts (2, 13-15); DFT2, however, 
originated from a male devil and shows pref- 
erence for male hosts, perhaps because of 
immunogenicity of chromosome Y-derived 
antigens in female hosts (3, 7, 10). Both cancers 
escape the allogeneic immune system, and, in 
DFT1, this is mediated by transcriptional repression
of major histocompatibility complex
(MHC) class I genes (16). In DFT2, however, 
cell surface MHC class I molecules are usually 
detectable, and the high similarity between 
expressed tumor and host MHC class I alleles 
may underlie the lack of immune rejection 
(7). The genomes of DFT1 and DFT2 show 
comparable mutational patterns, but no com- 
mon positively selected driver mutations have 
been detected (10). Furthermore, whereas 
DFT1 has split into several spatially defined 
sublineages during its spread through Tasmania 
(18), little is known about the clonal diversity 
of DFT2. 

In addition to their importance as threats to 
animal health and their intrinsic interest as 
unusual pathogens, transmissible cancers pro- 
vide an opportunity to study how mutations in 
cancer accumulate with time. Most human 
cancer studies involve the analysis of tumor 
biopsies collected either at a single session or 
at time points separated by short intervals.
The long-term survival of DFT1 and DFT2 permits
repeated sampling of the same cancer
lineages through decades, which enables di- 
rect investigation of variation in mutation 
rates, together with those of their consti- 
tutive mutational signatures, within and be- 
tween clones. 

Here, we describe high-coverage whole- 
genome sequences of 78 DFT1 and 41 DFT2 tu- 
mors, as well as that of a single nontransmissible 
carcinoma and a panel of 80 normal Tasmanian 
devil genomes, that were analyzed relative to a 
newly assembled, highly contiguous Tasmanian 
devil reference genome. By capturing the so- 
matic genetic diversity present within the DFT1 
and DFT2 lineages, our goal was to understand 
the dynamics of these diseases’ emergence 
and spread, to estimate their mutation rates, 
and to characterize their long-term patterns 
of evolution. By intersecting findings from dif- 
ferent Tasmanian devil cancers, we identify 
genomic events that underpin transmissible 
cancer in this species. Our analysis provides 
detailed insight into the evolution and diver- 
sification of two parallel cancer clones that 
have survived in a transmissible niche. 


An improved reference genome for the 
Tasmanian devil 


Previous Tasmanian devil genome assemblies 
were highly fragmented (13, 19, 20). To produce 
an improved genome assembly for the species, 
we extracted high molecular weight DNA from 
the female fibroblast cell line used in an earlier 
assembly (13). We sequenced this to 76-fold and 
12-fold coverage using long-read (fragment N50: 
9.05 kilobases, kb) and ultralong-read (N50: 
57.13 kb) sequencing technology (21). In addition,
DNA was analyzed with optical mapping,
linked-read sequencing, and high-dimension 
conformation capture (Hi-C). A reference ge- 
nome assembly, mSarHar1.11, was generated by 
combining these data (Table 1, table S1, and fig. 
S1). Notably, 99.8% of bases were placed on one 
of seven scaffolds, which correspond to the 
six devil autosomes and chromosome X. Ge- 
nome annotation was performed with the 
Ensembl gene annotation pipeline (22), guided 
by a newly sequenced Tasmanian devil multi- 
tissue transcriptome atlas, and yielded 19,228 
protein-coding gene models (table S1). 
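
For readers less familiar with the N50 statistic used to describe read and contig lengths here, the following Python sketch shows one common way to compute it: N50 is the length such that sequences of that length or longer together contain at least half of the total bases. The function name and the toy lengths are illustrative assumptions, not data from this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    the running total reaches at least half of the summed length; the
    length of the sequence that crosses this threshold is the N50.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

# Toy example with hypothetical contig lengths (arbitrary units)
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

In the toy example the total length is 32, and the two longest contigs (8 + 8) already cover half of it, so the N50 is 8. A higher N50 for the same total length indicates a more contiguous assembly.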


DFT1 and DFT2 phylogenies 


To investigate genetic variation within Tasma- 
nian devil transmissible cancers, we sequenced 
the whole genomes of 63 DFT1s and 39 DFT2s 
(Fig. 1A) to a median depth of 83x and 
analyzed these alongside 15 DFT1 and 2 
DFT2 publicly available genomes (table S2). 
The DFT1s were primarily selected to capture 
genetic and spatiotemporal diversity in this 
clone (Fig. 1B and table S2). These included 
representatives of the six major clades (A1, 

Fig. 1. DFT1 and DFT2 phylogenies. (A) Representative photographs of animals
infected with DFT1 and DFT2. (B) Sampling locations of 78 DFT1 and 41 DFT2
tumors included in the study. (C) Maximum likelihood phylogenetic tree
constructed with 104,799 somatic and 1,070,436 germline substitutions from
38 DFT1s, 12 DFT2s, a single nontransmissible carcinoma, and 79 Tasmanian
devils; only the subset of DFT1s and DFT2s with tumor purity ≥75% were
included. Black unlabeled tips represent Tasmanian devils, and shaded tips
represent those belonging to DFT1, DFT2, or the nontransmissible carcinoma.
Branch lengths are uninformative. High-resolution labeled tree with bootstrap
support is available in fig. S2. (D and E) Time-resolved phylogenetic trees
for DFT1 and DFT2 constructed with 171,283 and 21,252 somatic substitution


A2, B, C, D, and E) (18) and were collected 
from 38 locations between 2003 and 2018. 
For DFT2, we sequenced all available tumors 
sampled between 2014 and 2018, which all 
occurred within DFT2’s known range on the 
D’Entrecasteaux Channel Peninsula (Fig. 1B). 
Some subsets of DFT1 and DFT2 tumors were 
derived from the same individual hosts. These 
included sets of matched primary facial tumors 
and internal metastases as well as samples 
from distinct facial or body tumors occurring 
in single hosts (table S2). In addition, we se- 
quenced a nontransmissible anal sac carcinoma 
sampled from a captive Tasmanian devil and 
analyzed genomes from 80 normal Tasmanian 
devils, including matched hosts (71 newly se- 
quenced, 9 publicly available) (table S2). 
Single-base substitutions were called in 
each sample, and normal Tasmanian devil 
genomes were used to identify and exclude 
germline substitutions from tumor sequen- 
ces. This yielded 205,890, 23,152, and 5764 
somatic substitutions in DFT1, DFT2, and 
the nontransmissible anal sac carcinoma, re- 
mutations, respectively. Tumor clades are labeled (A1, A2, B, C, D, and E in DFT1; 
A and B in DFT2). Bars at internal nodes represent 95% Bayesian credible 
intervals around date estimates. Bars at root nodes represent 95% Bayesian 
credible intervals around date estimates and incorporate uncertainty in somatic 
or germline assignment of substitutions shared by all tumors within a clone and 
absent from all normal Tasmanian devils. Dating is based on tumor sampling 
dates and does not account for the pretransmission interval, the offset between 
date of clone emergence, and date of sampling; this is of relevance because bulk 
tissue sequencing captures only clonal mutations or those present in sizeable 
subclones (55). High-resolution labeled DFT1 and DFT2 trees with node posterior 
probability are available in figs. S3 and S5. 


Table 1. mSarHar1.11 Tasmanian devil reference genome assembly and annotation metrics. 

Metric                  Value
Contigs (N50)           445 (63.34 Mb)
Repeat-masked genome
Noncoding genes

spectively, as well as 1,458,776 germline 
variants (table S3). Analysis of the latter re- 
vealed a median of 0.132 heterozygous sites 
per kilobase (range 0.083 to 0.153) in the 
sampled population of Tasmanian devils, with 
the DFT1 and DFT2 founder devils both fall- 
ing within this range (table S3). 

We confirmed the independent clonal ori- 
gins of DFT1 and DFT2 by constructing a 
maximum likelihood tree using substitutions 
from both tumor and normal samples. As 
expected, DFT1 and DFT2 tumors each clus- 
tered into distinct groups whose positions 
relative to normal animals are consistent 
with the notion that these clones’ founder 
devils originated in northeastern Tasmania 
(DFT1) or on the D’Entrecasteaux Channel 
Peninsula (DFT2) (Fig. 1C and fig. S2) (3, 4, 10). 

Fig. 2. Intratumor genetic heterogeneity in DFT1 and DFT2. (A to C) 
Example of heterogeneous cell transmission in DFT2. 1509T1 is a DFT2 clade 
B tumor composed of two detectable subclonal cell populations, 1509T1subclone1 
and 1509T1subclone2. (A) Computational separation of 1509T1subclone1 and 
1509T1subclone2 and inclusion on a phylogenetic tree revealed subclone 
membership of distinct clade subgroups, DFT2-B3 and DFT2-B2. Branch lengths 
are proportional to number of substitution variants. (B) Variant allele distribution 
of 1509T1, together with those of representative DFT2-B3 (1529T2) and DFT2-B2 
(1334T1) tumors; only variants occurring after the split between DFT2-B3 (dark 
gray) and DFT2-B2 (light gray) are included. Because tumors are diploid, most 
mutations occur in the heterozygous state and would be expected to be found at 
50% proportion. (C) Model illustrating transmission of DFT2 from an earlier 
donor devil, which carried both DFT2-B3 (dark gray) and DFT2-B2 (light gray) 
cells, to recipient devils. Recipient tumors are composed either of clonal 
populations of DFT2-B3 (upper, dark gray, 1529T2), clonal populations of DFT2- 
B2 (lower, light gray, 1334T1) or a subclonal mixture of DFT2-B3 and DFT2-B2 
(middle, mixture of light gray and dark gray, 1509T1). Arrows do not necessarily 
represent direct transmission. (D to F) Example of differential transmission and 
metastasis of subclones in DFT1. 139T1 is a DFT1 facial tumor composed of two 
detectable subclonal cell populations, 139T1subclone1 and 139T1subclone2, repre- 
sented by dark gray and light gray shading, respectively. (D) 139T1subclone1 
clusters phylogenetically with a set of internal metastases sampled from the 
same individual (139T4, 139T5, 139T6), and 139T1subclone2 clusters with facial 
tumors sampled from three devils (140T, 141T, 142T) involved in a DFT 
transmission chain (fig. S6). Branch lengths are proportional to number of 
substitution variants. (E) Variant allele distribution of 139T1, together with those 
of a representative metastasis involving the same host (139T4) and a 
representative tumor secondary to transmission involving a different host (140T); 
only variants occurring after the split between the metastases (dark gray) and 
transmission (light gray) are included. Because tumors are diploid, most 
mutations occur in the heterozygous state and would be expected to be found at 
50% proportion. (F) Model illustrating differential spread of subclones. 
139T1subclone1 (dark gray) and 139T1subclone2 (light gray) are both represented in 
tumor 139T1. Cells belonging to 139T1subclone1 seeded internal metastases 
(represented by 139T4), whereas cells from 139T1subclone2 were transmitted 
onwards to recipient devils (represented by 140T). Further details available in 
fig. S6. The Tasmanian devil silhouette used throughout this figure is adapted 
from Nilsson et al. (56). 



Time-resolved phylogenetic trees were gen- 
erated for DFT1 and DFT2 with substitution 
mutation rates inferred by using tumor sampl- 
ing dates (Fig. 1, D and E). By assuming a 
constant mutation rate, DFT1 was estimated 
to have arisen in 1986 (95% Bayesian credible 
interval 1982 to 1989), which implies a sub- 
stantial delay from its emergence until its 
first observation in 1996 (Fig. 1D and fig. S3) (4). 
The DFT1 tree showed the expected arrange- 
ment of the six identified tumor clades (18) and 
revealed that these split from one another very 
early in DFT1 evolution in a rapid diversification 
event that almost certainly involved a single 
tumor donor (fig. S4). DFT2, however, is esti- 
mated to have first emerged in 2011 (95% 
Bayesian credible interval 2009 to 2012). It 
subsequently split into two major sympatric 
groups, which we term DFT2 clades A and B 
(Fig. 1E and fig. S5). The potential for individual 
devils to be coinfected with distinct lineages of 
DFT1 (18), DFT2, or both (10) is apparent. 
The presence of true or near-polytomies evident 
on both the DFT1 and DFT2 phylogenetic trees, 
which are defined by very short internal bran- 
ches (Fig. 1, D and E), suggests that it may not 
be uncommon for infectious devils to transmit 
their tumor to more than two secondary hosts. 
Such events may, however, be enriched at early 
time points in the trees because of survivorship 
bias (23). 


Intratumor genetic heterogeneity in DFT1 
and DFT2 


Bulk sequencing of tumor tissue, as performed 
here, will capture only clonal mutations or those 
present in sizeable subclones. When present, 
however, the distribution of subclones among 
tumors could be informative about the clon- 
ality of transmission in DFT1 and DFT2. 
We screened tumors for subclones by 
searching for mutation populations that 
showed unexpected allele fractions. One 
DFT2 tumor, 1509T1, was found to be com- 
posed of two subclonal cell populations that 
were represented at ~60 and 40% frequency, 
respectively. We computationally isolated 
these subclones, and inspection of their posi- 
tions on the DFT2 phylogenetic tree revealed 
that they belonged to separate DFT2 clade B 
sublineages, which we term DFT2-B2 and 
DFT2-B3 (Fig. 2, A and B). Indeed, mutations 
defining each subclone were observed clonally 
in related contemporaneous tumors from dif- 
ferent hosts. These data are compatible with 
a model in which an earlier donor tumor con- 
tained cells belonging to both DFT2-B2 and 
DFT2-B3; onward transmission founded descend- 
ent tumors composed of either DFT2-B2 or 
DFT2-B3 cells or, in the case of 1509T1, a mixture 
of both DFT2-B2 and DFT2-B3 cells (Fig. 2C). 
We similarly investigated intratumor heter- 
ogeneity in DFT1 using a closely related set of 
tumors that were part of a series of direct 
transmission events (Fig. 2D and fig. S6). This 
case involved a female devil with a facial tumor 
and several metastases. Cells were transmitted 
from this female’s facial tumor to her unweaned 
male offspring, who, once weaned, further 
transmitted his tumor to two additional hosts 
while the group was housed together in cap- 
tivity (fig. S6). The index female’s facial tumor 
was composed of two detectable subclones at 
~90 and 10% proportions, which clustered 
with the tumor of the offspring and with her 
metastases, respectively (Fig. 2, D and E). 
This suggests that two distinct cell lineages, 
both represented within the index facial tumor, 
differentially contributed to metastatic dissem- 
ination and onward transmission (Fig. 2F). 
These case studies hint at the genetic het- 
erogeneity present within individual DFT tumors 
and, in the DFT2 example, imply that this di- 
versity can be maintained across transmission 
bottlenecks. Thus, at least in some cases, DFT 
tumors are seeded by more than one cell. 
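
The allele-fraction reasoning used in this section can be illustrated with a short Python sketch. It assumes a diploid tumor in which every somatic mutation is heterozygous and unaffected by copy number change, so that a mutation carried by a fraction f of tumor cells in a sample of purity p is expected at a variant allele fraction of f x p / 2. The function names, the toy allele fractions, and the 0.85 cutoff are hypothetical choices made for illustration; they are not taken from the analysis pipeline used in this study.

```python
def expected_vaf(cell_fraction, purity=1.0):
    """Expected variant allele fraction of a heterozygous mutation.

    Assumes a diploid locus in both tumor and contaminating normal cells,
    with the mutation present on one of two copies in a fraction
    `cell_fraction` of the tumor cells.
    """
    return cell_fraction * purity / 2.0


def cell_fraction_from_vaf(vaf, purity=1.0):
    """Invert expected_vaf: fraction of tumor cells carrying the mutation."""
    return 2.0 * vaf / purity


def split_by_cell_fraction(vafs, purity=1.0, clonal_cutoff=0.85):
    """Separate variants into clonal and subclonal lists.

    `clonal_cutoff` is an arbitrary illustrative threshold on the inferred
    cell fraction, not a value reported in the text.
    """
    clonal, subclonal = [], []
    for vaf in vafs:
        if cell_fraction_from_vaf(vaf, purity) >= clonal_cutoff:
            clonal.append(vaf)
        else:
            subclonal.append(vaf)
    return clonal, subclonal


# Toy allele fractions (invented): two clonal variants and two subclonal
# clusters near 30% and 20% allele fraction.
toy_vafs = [0.50, 0.48, 0.31, 0.29, 0.21, 0.19]
clonal_variants, subclonal_variants = split_by_cell_fraction(toy_vafs)
```

Under these assumptions, subclones making up ~60 and ~40% of tumor cells would give mutation clusters near 30 and 20% allele fraction, below the 50% expected for clonal heterozygous mutations.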


DFT1 and DFT2 substitutions and indels 


To obtain an overview of the mutational pro- 
cesses operating in Tasmanian devil cancers, 
we inspected each tumor’s mutational spec- 

Table 2. Summary of DFT1 and DFT2 mutation rates. Mutation rates were estimated by using 
linear regression except for Substitutions (BEAST), which was estimated by using a Bayesian 
phylogenetic approach (57). Rates represent mutation count per genome per year. These can be 
converted to mutation count per nucleotide per genome per year by dividing by callable genome 
size (2,983,/50,195 nucleotides). Rate ranges represent 95% confidence interval of the linear 

fit except for Substitutions (BEAST), for which range represents 95% Bayesian credible interval. DFT1 
hypermutator clade E was excluded from substitution and indel rate calculations. Ratio ranges 
represent error-propagated 95% confidence intervals. Level of significance of F test for linear fit is 
shown; ratios of mutation classes that did not show significant linear fits are not displayed. *F test 


P < 0.01; **F test P <1 x 10°*: ns, not significant. 


Mutation class          DFT1 rate, per year       DFT2 rate, per year       DFT2:DFT1 rate ratio
Substitutions (BEAST)   215.5 [212.2 to 218.5]    516.7 [505.8 to 527.7]    2.397 [2.336 to 2.457]
Copy number events      0.7 [0.2 to 1.1]*


trum, a representation of the distribution of 
mutations across the six base substitution 
classes, displayed together with their immedi- 
ate 5’ and 3’ base contexts. Such spectra can be 
decomposed into their constituent mutational 
signatures, patterns of co-occurring mutation 
types that reflect the activities of underlying 
endogenous or exogenous mutational pro- 
cesses (24). As expected, DFT1 and DFT2, as 
well as the single nontransmissible anal sac 
carcinoma, showed evidence for the presence 
of two known mutational signatures, single- 
base substitution signatures 1 (SBS1) and 5 
(SBS5), which are found almost universally 
in human cancer (25) and have been described 
previously in Tasmanian devil tumors (Fig. 3A 
and fig. S7) (10). SBS1 is characterized by C>T 
mutations at CpG dinucleotide contexts and 
is believed to primarily arise because of spon- 
taneous deamination of 5-methylcytosine 
(24). SBS5, however, shows little base speci- 
ficity, and its etiology is poorly understood 
(25, 26). Consistent with a previous report (10), 
no evidence of ultraviolet light mutagenesis 
was detectable in DFT1 or DFT2 mutation pat- 
terns, which indicates that the cells that transmit 
DFT are not usually exposed to sunlight. Pat- 
terns of short insertions and deletions (indels) 
in DFT1 and DFT2 revealed imprints of indel 
signatures 1 (ID1) and 2 (ID2) in both cancers 
(25), although ID1 dominated in DFT1 (66% ID1, 
34% ID2) whereas ID1 and ID2 were present at 
similar proportions in DFT2 (47% ID1, 53% 
ID2; Fig. 3B and fig. S8). These signatures are 
defined by the accumulation of insertions 
(ID1) or deletions (ID2) of single thymine or 
adenine bases that occur at mononucleotide 
5.6 [4.7 to 6.5]** 8.391 [7.701 to 9.080] 


tracts and arise through polymerase slippage 
that involves the nascent (ID1) or the template 
(ID2) DNA strand (25). 
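
As an illustration of how a substitution spectrum of this kind is tabulated, the Python sketch below assigns each single-base substitution to one of 96 channels: six substitution classes, each reported with a pyrimidine reference base, combined with the 16 possible pairs of flanking 5’ and 3’ bases. The function names and the input format (a three-base reference context plus the mutant base) are assumptions made for this example and do not describe the software used in this study.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def classify_substitution(context, alt):
    """Label a substitution with its 96-channel category, e.g. 'A[C>T]G'.

    `context` is the three-base reference sequence centered on the mutated
    base and `alt` is the mutant base. Mutations at purine reference bases
    are reverse-complemented so that the reference base is always C or T.
    """
    context = context.upper()
    alt = alt.upper()
    if len(context) != 3:
        raise ValueError("context must be exactly three bases")
    ref = context[1]
    if ref == alt:
        raise ValueError("alt must differ from the reference base")
    if ref in "AG":
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
        ref = context[1]
    return f"{context[0]}[{ref}>{alt}]{context[2]}"


def mutational_spectrum(mutations):
    """Count substitutions per channel from (context, alt) pairs."""
    return Counter(classify_substitution(c, a) for c, a in mutations)


# Toy input (invented): two C>T changes at a CpG site seen on opposite
# strands, plus one C>G change.
toy_spectrum = mutational_spectrum([("ACG", "T"), ("CGT", "A"), ("TCA", "G")])
```

Collapsing the two strands in this way is what reduces the 12 possible base changes to six classes; signature fitting then operates on the resulting 96 counts.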

Mutational signatures SBS1, SBS5, ID1, and 
ID2 all present clock-like properties in human 
cells, defined by linear correlation with do- 
nor age (25, 27, 28). Their rates vary widely 
among tissues, and, whereas the rates of SBS1, 
ID1, and ID2 correlate with one another and 
are believed to reflect the number of mitoses 
that a cell has experienced, SBS5 rate is inde- 
pendent of these (25). We characterized overall 
substitution and indel rates, as well as rates of 
SBS1, SBS5, ID1, and ID2 in DFT1 and DFT2 by 
regressing the number of mutations attributa- 
ble to each signature in each tumor against 
sampling date (Fig. 3, C to F). These analyses 
revealed that overall substitution and indel 
mutation rates in DFT2 were 3.0 and 3.9 
times higher, respectively, than those of DFT1 
(Table 2). SBS1 and ID1 accumulate only mo- 
derately faster in DFT2 than in DFT1, but rates 
of SBS5 and ID2 are both considerably higher 
in DFT2 than in DFT1 (Table 2; Fig. 3, E and F; 
and table S3). 
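
The regression of mutation count against sampling date can be sketched in Python as an ordinary least-squares fit whose slope is the mutation rate per genome per year. The helper names and the tumor values below are invented toy numbers chosen to give round results; they are not data from this study, and the confidence intervals reported in Table 2 are not reproduced here.

```python
def fit_rate(sampling_dates, mutation_counts):
    """Ordinary least-squares fit of mutation count against sampling date.

    Returns (slope, intercept); the slope is the estimated number of
    mutations gained per genome per year.
    """
    n = len(sampling_dates)
    if n < 2 or n != len(mutation_counts):
        raise ValueError("need at least two paired observations")
    mean_x = sum(sampling_dates) / n
    mean_y = sum(mutation_counts) / n
    sxx = sum((x - mean_x) ** 2 for x in sampling_dates)
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(sampling_dates, mutation_counts)
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rate_ratio(rate_a, rate_b):
    """Ratio of two mutation rates, e.g. DFT2 relative to DFT1."""
    return rate_a / rate_b


# Toy tumors (invented): sampling year and number of mutations attributed
# to one signature in each tumor.
toy_dft1_rate, _ = fit_rate([2004.0, 2008.0, 2012.0, 2016.0],
                            [3600, 4400, 5200, 6000])
toy_dft2_rate, _ = fit_rate([2014.0, 2016.0, 2018.0],
                            [1500, 2500, 3500])
toy_ratio = rate_ratio(toy_dft2_rate, toy_dft1_rate)
```

Fitting each signature separately, as described above, amounts to calling `fit_rate` once per signature with that signature's per-tumor counts.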

The relationship between substitution bur- 
den and sampling date is linear in both DFT1 
and DFT2. Nevertheless, a group of DFT1 tumors 
can be observed with fewer substitutions attrib- 
utable to both SBS1 and SBS5 than expected (Fig. 
3G). These tumors belong to a single branch of 
the phylogenetic tree, clade C2/3, which corres- 
ponds to the group of clade C tumors sampled in 
northwest Tasmania (Fig. 3G). The mutation rate 
we inferred when considering only these tumors 
(179 mutations per year, 95% confidence interval 
131 to 227) is similar to that of the remaining 

Fig. 3. DFT1 and DFT2 substitutions and indels. (A and B) Mutational 

spectra for somatic substitutions (A) and indels (B) in DFT1 (blue, n = 176,428 
substitutions, n = 22,479 indels; variants specific to DFT1 clade E were excluded) 
and DFT2 (red, n = 23,152 substitutions, n = 4054 indels). Fully labeled plots 
available in figs. S7 and S8. (C and D) Rate of accumulation of substitutions 
(C) and indels (D) in DFT1 excluding clade E (blue) and DFT2 (red). Each point 
represents a tumor, plotted by sampling date. Lines represent linear regression, 
gray shading 95% confidence interval. R², coefficient of determination. (E) Rate 
of accumulation of substitution mutations corresponding to mutational 
signatures SBS1 (left) and SBS5 (right) in DFT1 excluding clade E (blue) and 
DFT2 (red). Each point represents a tumor, plotted by sampling date. Lines 
represent linear regression, gray shading 95% confidence interval. (F) Rate of 
accumulation of indel mutations corresponding to mutational signatures 
ID1 (left) and ID2 (right) in DFT1 excluding clade E (blue) and DFT2 (red). Each 
point represents a tumor, plotted by sampling date. Lines represent linear 
regression, gray shading 95% confidence interval. (G) Transient reduction in 
DFT1 substitution mutation rate occurring within phylogenetic branch leading to 
DFT1 clade C2/3 (arrow and shading, left); tumors in DFT1 clade C2/3 occur in 
Tasmania's northwest (map). Overall mutation rate reduction (center) is 
attributable to both mutational signatures SBS1 and SBS5 (second from right, 
right); each point represents a tumor, plotted by sampling date, with clade C2/3 
tumors represented as triangles. Lines represent linear regression, gray 
shading 95% confidence interval. (H) The single representative of DFT1 

clade E, sampled in northeast Tasmania (tree, map) has elevated numbers of 
substitution and indel mutations; central plots show numbers of substitutions 
(left) and indels (right) in all DFT1 tumors plotted by sampling date, with 

clade E tumor represented by triangle. Clade E has distinctive substitution and 
indel mutational spectra, with at least 60% of the spectrum explained by 
signature SBS6 (second from right; fully labeled plots available in fig. S9). Clade 
E carries a deletion encompassing the MLH1 locus (right; dots represent 
normalized read coverage within 1-kb genomic windows, with windows 
including MLH1 shaded in black; MBP, mega-base pairs; connecting arcs 
represent rearrangements). High-resolution images and source data available 
in figs. S7 to S9 and table S3. 

Fig. 4. LINE-1 transposable element activity in DFT1 and DFT2. (A) Rate of 
LINE-1 insertion accumulation in DFT1 (blue) and DFT2 (red). Each point 
represents a tumor, plotted by sampling date. Lines represent linear regression, 
gray shading 95% confidence interval. (B) DFT2 3' transduction activity of a 
LINE-1 source element at chromosome 1:516.6 megabases (Mb) (star). In the 


DFT1 tumors (202 mutations per year, 95% 
confidence interval 166 to 238); however, there 
are ~1200 fewer mutations genome-wide in 
the overall clade C2/3 burden than expected. 
Indeed, clade C2/3 tumors accounted for a 
significant fraction of the variance in the lin- 
ear fit for substitutions, attributable to both 
SBS1 and SBS5, regressed against time (Fig. 
3G). These observations suggest that a transient 
reduction in mutation rate occurred during the 
chain of transmissions taking place between 
1991 and 2003 that transported DFT1 into 
Tasmania’s northwest, perhaps due to a tem- 
porary reduction in cell division rate. Such 
fluctuations in mutation rate may not be 
uncommon, with detection in this particular 
case made possible because of the long inter- 
nal branch and particularly dense sampling 
of DFT1 clade C2/3. 


A DFT1 hypermutator lineage 


Although most DFT1 and DFT2 tumors have 
very similar mutational spectra, a single DFT1 
tumor, the unique representative of the early 
divergent clade E, named 377T1, had a highly 
distinctive pattern of mutations (Fig. 3H). Sig- 
nature fitting suggested that, in addition to SBS1, 
SBS5, ID1, and ID2, this tumor also carried 
mutations attributable to mutational signa- 
tures SBS6 and ID7 (fig. S9). Furthermore, 
377T1 carried 6 and 10 times more substitu- 
tions and indels, respectively, than expected 
from other DFT1 tumors that were sampled 
at a similar time (Fig. 3H). Because SBS6 and 
ID7, as well as elevated activity of ID1 and ID2, 
have been linked to deficiencies in DNA mis- 
match repair (25, 26), these observations suggest 
that a clonal ancestor of 377T1 lost mismatch 
repair function. To identify the lesion that dis- 
rupted mismatch repair in 377T1, we screened 
the sequences of genes that encode mismatch 
repair effectors in DFT1 tumor genomes and 
discovered a focal deletion specific to 377T1 
that removed a single copy of MLH1 (Fig. 3H). 
Supporting a role for this gene, the 377T1 
mutational spectrum is highly reminiscent of 

circos plot, chromosomes are represented by black bars, and red arcs connect 
source element to 3’ transduction integration site. (C) DFT2 phylogenetic tree 
as shown in Fig. 1E with circos plots illustrating temporal activity of the 

LINE-1 source element located at chromosome 1:516.6 Mb. Nodes corresponding 
to each circos plot are represented in red. Source data available in table S4. 


that reported in human cells that lack MLH1 
(29). No mutations, however, were detected in 
the remaining copy of MLH1, and we spec- 
ulate that this may have been transcriptional- 
ly silenced, for example, by promoter DNA 
methylation. 


Transposable element activity in DFT1 
and DFT2 


Transposable elements are frequently active in 
human cancer (30), but it is not known whe- 
ther these are mobilized in Tasmanian devil 
cancers. Several families of transposable ele- 
ments are annotated in mSarHar1.11, includ- 
ing 1948 full-length long interspersed nuclear 
element 1 (LINE-1) retroelements (table S1). 
We systematically screened for somatic LINE-1 
insertions in DFT1 and DFT2 and found high 
LINE-1 transposition activity in DFT2, with 
hundreds of insertions detected. In DFT1, 
however, no clear evidence of LINE-1 activity 
was found (table S4). LINE-1 mobilization 
events were observed throughout the DFT2 

Fig. 5. Genome rearrangement in DFT1 and DFT2. (A) Rearrangement and 
copy number profiles of the DFT1 (left, blue) and DFT2 (center, red) most recent 
common ancestor tumors (trees, arrows; DFT1 and DFT2 trees as shown in Fig. 1, 
D and E, respectively). Chromosomes are represented by gray blocks annotated 
with copy number state. Inner arcs represent rearrangements. Right, rearrange- 
ment and copy number profiles of a single Tasmanian devil nontransmissible 
carcinoma. The location of the highly amplified E542K mutation in PIK3CA 

is labeled (asterisk). (B) Rates of accumulation of rearrangement events (left: 
“events” denotes that clustered rearrangements have been merged) and CNVs 
(right) in DFT1 (blue) and DFT2 (red). Tumors are represented by points, 
plotted by sampling date. Lines represent linear regression, gray shading 95% 
confidence interval. (C) Example of a late chromothripsis event in DFT1. A single 
DFT1 tumor (blue dot, arrow on phylogenetic tree) carries a chromothripsis event 
on chromosome 1; on the circos plot, rearrangements specific to the affected 
tumor are drawn in blue, and shared rearrangements that were acquired before 
this tumor’s divergence are drawn in black. Right, copy number plot illustrates 
rearrangements involving the chromothriptic region (arcs; blue arcs are specific 
to this tumor, black arcs are shared with other tumors), and copy number is 
illustrated with binned coverage; each bin represents normalized read coverage 
in a 1-kb window. (D and E) Examples of chromoplexy events in DFT1 (left) and 
DFT2 (right). In both cases, positions of nodes represented by each circos plot 
are illustrated on the relevant phylogenetic tree, either along a four-step time- 
resolved (T1 to T4) branch trajectory in DFT1 (D) or throughout the DFT2 
phylogeny (E). Chromosomes are represented by black blocks and rearrange- 
ments by colored arcs. (F) Timing of whole-genome doubling events in DFT1 
(15 events) and DFT2 (3 events). The estimated date of each whole-genome 
duplication is illustrated on tree with colored dot. Further information and source 
data are available in figs. S10 and S11 and tables S5 and S6. 


phylogenetic tree and accumulated linearly 
with time (Fig. 4A, Table 2, and table S4). 
Transcriptional readthrough occasionally 
mobilizes genomic DNA downstream of LINE-1 
source elements in a process known as 3’ 
transduction (30). A subset of DFT2 LINE-1 
insertions carried 3’ transductions, which iden- 
tified 35 functional LINE-1 source elements 
in DFT2 (table S4). Although most DFT2 source 
elements could be associated with only a single 
LINE-1 3’ transduction event, one source ele- 
ment located on chromosome 1 spawned at 
least 29 LINE-1 3’ transductions, with ac- 
tivity continuing throughout the DFT2 phylo- 
genetic tree (Fig. 4, B and C). Overall, these 
findings reveal that LINE-1 retroelements are 
transposition competent in Tasmanian devil 
genomes and that their activity varies sub- 
stantially between DFT1 and DFT2. 


Genome rearrangement in DFT1 and DFT2 


The availability of mSarHar1.11 enabled de- 
tailed reconstruction of the chromosomal re- 


arrangements that initiated DFT1 and DFT2. 
The genome catastrophe that marked the origin 
of DFT1 is focused on the tip of the long arm of 
chromosome 1 (10, 14, 31). This region is mas- 
sively internally rearranged through dozens of 
inversions interspersed with short deletions 
and interchromosomal translocations (Fig. 5A 
and tables S5 and S6). These changes are com- 
patible with a complex chromothripsis event, as 
previously proposed (14). The early rearrange- 
ments of DFT2 are less clustered than those 
of DFT1 (Fig. 5A and tables S5 and S6) (10). 
Chromosome ends are notably involved in 
rearrangement in both DFT1 and DFT2, which 
is consistent with a role for telomere dysfunc- 
tion in DFT initiation (10, 14, 31). 

The genome of the spontaneous nontrans- 
missible anal sac carcinoma showed extensive 
rearrangement and copy number alteration 
(Fig. 5A and tables S5 and S6). This cancer’s 
pattern of stepwise amplification is com- 
patible with the activity of several breakage- 
fusion-bridge cycles. It is notable that the copy 
number landscape of this tumor is markedly 
more complex than those of the respective 
most recent common ancestors of DFT1 and 
DFT2, which indicates that, as in humans, 
there are several routes to carcinogenesis in 
Tasmanian devils. This is important because 
it implies that the mutational patterns observed 
in DFT1 and DFT2 are typical of DFT, not of 
Tasmanian devil cancer in general. 

Rearrangement events and copy number 
variants (CNVs) both accumulated linearly with 
time in DFT2 (Fig. 5B and tables S5 and S6). 
Although slight temporal increases were de- 
tected in DFT1, these were only marginally 
significant, which confirms previous findings 
that the rate of genomic structural change in 
DFT1 is barely detectable above background 
variation among sublineages (18). Despite this, 
it is noteworthy that the group of DFT1 clade 
C2/3 tumors that carried fewer substitution 
mutations than expected (Fig. 3G) also showed 
fewer rearrangement events and CNVs (fig. S10), 
which suggests that the transient reduction in 
mutation rate occurring on the westward trans- 
mission chain operated across mutation classes. 

The spectra of polymorphic (i.e., occurring 
after each lineage’s most recent common an- 
cestor) genomic rearrangements in DFT1 and 
DFT2 were similar, with small-scale altera- 
tions dominating (tables S5 and S6). Several 
more complex events were also observed in 
both lineages, however, including occasional 
chromothripsis (Fig. 5C) and ongoing chro- 
moplexy (Fig. 5, D and E). We investigated 
the genomic contexts and haplotype specificity 
of a subset of CNVs that were observed to occur 
repeatedly either within or between DFT 
lineages (18); one of these was associated with 
repetitive structural features that likely trigger 
genome instability (table S6). Copy-neutral vari- 
ation in minor copy number was rare in DFT1 
and undetectable in DFT2, which is consistent 
with these tumors’ overall patterns of copy num- 
ber stability (18). 


Whole-genome doubling in DFT1 and DFT2 


Among the 78 DFT1 and 41 DFT2 tumors an- 
alyzed, 16 DFT1s and 3 DFT2s were identi- 
fied as likely tetraploid, which defined 15 DFT1 
and 3 DFT2 whole-genome duplication events. 
By counting the number of substitution mu- 
tations occurring before and after genome 
duplication in each tetraploid lineage and 
applying the previously inferred substitution 
mutation rates, we estimated the dates upon 
which genome doubling occurred. This iden- 
tified whole-genome duplications that predated 
sampling of tumors by up to 7 years (median 1.8) 
(Fig. 5F, fig. S11, and table S6). DFT tumors that 
had undergone genome duplication showed an 
increased frequency of whole-chromosome or 
whole-chromosome-arm gain or loss events, 
compared with diploid tumors (Fisher’s exact 
test P < 0.01) (table S6). This may at least in 
part be due to mitotic spindle defects intro- 
duced secondary to centrosome duplication 
(32) or due to a shortage of chromosome rep- 
lication effectors in the first cell cycle after 
genome doubling (33); alternatively, it is 
possible that such large-scale aberrations are 
better tolerated in the tetraploid state. 
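
One simple way to turn a post-duplication mutation count into a date is sketched below in Python. It is an illustration under stated assumptions rather than a description of the procedure used here: it assumes that the mutation rate per genome copy is constant, so that a tetraploid cell gains mutations at `ploidy_factor` times the diploid rate, and that all mutations acquired after doubling can be identified. The function names, the `ploidy_factor` parameter, and the example numbers are hypothetical.

```python
def years_since_doubling(post_doubling_count, diploid_rate_per_year,
                         ploidy_factor=2.0):
    """Estimate the years elapsed between genome doubling and sampling.

    `post_doubling_count` is the number of substitutions acquired after
    doubling. `ploidy_factor` scales the diploid rate to the tetraploid
    genome; 2.0 is an assumption of this sketch (use 1.0 for no scaling).
    """
    if diploid_rate_per_year <= 0 or ploidy_factor <= 0:
        raise ValueError("rate and ploidy_factor must be positive")
    return post_doubling_count / (diploid_rate_per_year * ploidy_factor)


def doubling_date(sampling_date, post_doubling_count, diploid_rate_per_year,
                  ploidy_factor=2.0):
    """Estimated calendar date (decimal year) of the genome doubling."""
    elapsed = years_since_doubling(
        post_doubling_count, diploid_rate_per_year, ploidy_factor
    )
    return sampling_date - elapsed


# Toy example (invented numbers): a tumor sampled at 2016.0 carrying 720
# post-doubling substitutions, with a diploid rate of 200 per year.
toy_date = doubling_date(2016.0, 720, 200.0)
```

Mutations acquired before doubling are expected on two of the four chromosome copies and those acquired afterwards on one, which is what allows the two groups to be counted separately.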


Signals of selection in DFT1 and DFT2 


The mutations that initiated DFT1 remain 
unknown, although a number of candidates 
have been proposed (10, 11, 31). It seems al- 
most certain that the catastrophic event at 
the origin of DFT1 produced one or more 
driver mutations. The complex disruption 
of a single copy of LZTR1 (31) is the most 
plausible driver candidate associated with 
this event (Fig. 6, A and B). In DFT2, focal 
copy number amplification of PDGFRA is 
shared by all DFT2 tumors and remains a 
strong early driver candidate (Fig. 6A) (10). 
In contrast to DFT1 and DFT2, the non- 
transmissible carcinoma carries recognizable 
driver mutations in well-characterized cancer 
genes (E542K PIK3CA mutation amplified to 
>60 copies; TP53 truncation; NOTCH2 muta- 
tions) (Fig. 5A and tables S6 and S7). Overall, 
the paucity of clear early driver mutations in 
DFT1 and DFT2, as well as the absence of cau- 
sative cancer genes shared by both lineages, 
suggests that these cancers arose from a cell 
type that, perhaps by virtue of its epigenetic or 
transcriptional state, was predisposed to car- 
cinogenesis and required only minimal genetic 
perturbation to produce transmissible cancer. 

To explore ongoing evolution in DFT1 and 
DFT2, we first used dNdScv (34) to analyze evo- 
lutionary signal among substitution and indel 
mutations (Fig. 6C and table S8). This provided 
no evidence for widespread negative selection 
acting to remove deleterious mutations from 
the coding genomes of DFT1 or DFT2. However, 
a single gene in DFT1, MGA, which encodes a 
transcription factor that opposes MYC activ- 
ity, showed plausible signs of positive selection 
through repeated truncation (global likelihood 
ratio test q < 0.005) (Fig. 6D). MGA has been 
implicated in cancer, although its driver status 
is not confirmed (35, 36), and occurs in a hap- 
loid state in nearly all DFT1s (Fig. 6A). 

Next, we searched for evidence of late 
drivers involving copy number variation. We 
created a chromosome map that displays to- 
tal CNV burden within the sampled DFT1 
and DFT2 population and examined this for 
focal amplification (Fig. 6E). This screen 
detected the previously described repeated 
amplification of PDGFRB in DFT1 (Fig. 6, A 
and E) (10, 18) and indicated that further 
copy number gains of the early PDGFRA 
amplicon in DFT2 have occurred repeatedly 
in DFT2 clade A (Fig. 6, A and E). This analysis 
also identified two known recurrent focal 
amplifications on chromosomes 4 and 5 in 
DFT1, the latter containing HMGA2, and the 
former carrying 16 genes including BIRC5 (18). 
In addition, although they are not recurrent, 
the focal amplification of RAC1 to four copies 
in a single DFT1 and focal homozygous deletion 
of PTEN in one DFT2 stand out as potential late 
driver events (table S6). 

DFT2 arose from a male founder devil and 
thus carries chromosome Y. The skew toward 
male hosts that is present in the DFT2 pop- 
ulation (7), as well as a previous observation 
that chromosome Y had been lost from a 
single female DFT2 host, prompted specu- 
lation that loss of chromosome Y (LoY) may 
be under positive selection in DFT2 by reducing 
the immunogenicity of this cancer in female 
hosts (10). We investigated this hypothesis by 
analyzing the copy number of chromosome Y 
in our panel of DFT2 tumors. We detected five 
LoY events throughout the phylogeny of the 41 
DFT2 tumors analyzed, one of which occurred 
in the ancestor of DFT2 clade B and is shared 
among all tumors of this group (Fig. 6, A and 
E). Somatic LoY is commonly observed in 
human normal and cancer cells, and the role 
of selection in driving this alteration in these 
contexts is poorly understood (37-39). Thus, 
although suggestive, we cannot confirm that 
DFT2 LoY is under positive selection; indeed, 
somatic LoY was observed in the analyzed 
nontransmissible devil anal sac carcinoma 
(table S6). However, it is noteworthy that a 
previous study that tracked the karyotype of 
a chrY+ DFT2 cell line through two hundred
passages in vitro made no mention of LoY in 
this immunologically neutral setting (40). If 
the presence of chromosome Y is indeed an 
immunological barrier to the colonization of 
female hosts, then no sex imbalance would 
be expected among hosts of chrY− DFT2.


Discussion 


The assembly of a highly complete and con- 
tiguous reference genome for the Tasmanian 
devil has enabled comprehensive genomic 
characterization of this species’ two transmis- 
sible cancers. DFT1 and DFT2 are independent 
realizations of a common biological phenom- 
enon. Although the two cancers are overall 
highly similar in their genome features, espe- 
cially when compared with a nontransmissible 
Tasmanian devil cancer, several differences 
exist; this ecological niche will tolerate different
forms.

Fig. 6. Signals of selection in DFT1 and DFT2. (A) Phylogenetic positions of
candidate driver mutations in DFT1 (blue) and DFT2 (red). Upward-pointing 
triangles and “+” notation represent copy number amplifications; downward-pointing
triangles and “−” notation represent copy number losses or gene
inactivation events. Multiple gains or losses in the same phylogenetic node are 
only represented once. DFT1 and DFT2 trees as shown in Fig. 1, D and E, 
respectively. Chr, chromosome. (B) Rearrangement of a single copy of LZTR1 in
DFT1. LZTR1 (exons represented by black boxes, introns with black connectors) 
occurs within the densely rearranged region of chromosome 1 that is common 
to all DFT1s (circos plot; black bars represent chromosomes and blue arcs 
represent rearrangements common to all DFT1s; table S5). The location of each 
rearrangement in LZTR1 is represented by a triangle, with the coordinates of each 
partner locus labeled. (C) Normalized ratio of nonsynonymous-to-synonymous 
substitutions and indels (dN/dS) in DFT1 and DFT2. Dashed line indicates 
dN/dS = 1 (neutrality), and bars represent 95% confidence intervals. (D) Genomic 
representation of the MGA locus on chromosome 2 in DFT1, exons represented
by black boxes, introns with black connectors. Blue triangles represent the
six coding mutations identified in this gene, all of which are truncating (tables S7
and S8). 5′ UTR, 5′ untranslated region. (E) Map representing CNVs detected
within the sampled cohort of 78 DFT1 (upper, blue) and 41 DFT2 (lower,
red) tumors. Chromosomes are represented horizontally, with chromosome Y
not shown to scale. Each CNV is represented by a colored bar, with copy number
gains illustrated above the gray chromosome representation (“gain depth”)
and copy number losses illustrated below the chromosome representation (“loss
depth”). Mitotically inherited CNVs are represented once; thus, each colored bar
represents a single CNV occurrence. CNVs that co-occur in the same tumors, 
and are thus likely to be linked, are connected with colored arcs; in DFT1, the set 
of linked losses are associated with the unstable small chromosome known as 
marker 5 (18). Arrows label candidate driver genes or genomic coordinates 
associated with prominent focal amplicons. Data associated with this figure are 
available in tables S6 to S8. Table S6 shows haplotype phasing of selected 
recurrent CNVs.

A particularly notable difference between
DFT1 and DFT2 is the elevated mutation rate, 
observable across mutation classes, of DFT2 
(Table 2). One explanation for this would be 
that DFT2 has a faster cell division rate than 
DFT1 and thus greater opportunity for the
accrual of mutations associated with DNA 
replication. If true, this might influence 
relative growth rates and generation times of 
DFT1 and DFT2, with potentially complex epi- 
demiological implications. However, other 
differences in cell state unrelated to division 
rate, perhaps, for instance, associated with 
differentiation state of the two cancers’ cells of 
origin (9, 10, 12), may underlie this observa- 
tion. Furthermore, although it is tempting to 
attribute the elevation in rates across different 
mutation classes in DFT2 to a common cause, 
it is possible that these are, in fact, unrelated, 
particularly as the magnitude of difference 
varies among mutation classes and signatures 
(Table 2). In particular, the LINE-1 retrotrans- 
position activity we observed in DFT2, but 
not in DFT1, may reflect differences in the two
lineages’ epigenetic states (41). More generally,
the mutation rates inferred from DFT1 and 
DFT2 provide evidence that large-scale muta- 
tions, including rearrangement events, trans- 
poson insertions, and CNVs, can have clock-like 
properties within individual cancers. 
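
As an illustration of what a clock-like property implies, the following Python sketch regresses mutation count on sampling date. Under a constant rate the points fall on a line whose slope is the mutation rate and whose x-intercept estimates the date of origin of the clone. The function name and the four data points are synthetic and hypothetical; they are not values from Table 2.

```python
def fit_clock(dates, counts):
    """Ordinary least-squares fit of counts = rate * (date - origin)."""
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, counts))
    rate = sxy / sxx                    # mutations gained per unit time
    intercept = mean_y - rate * mean_x  # count extrapolated to date 0
    origin = -intercept / rate          # date at which the fitted count is 0
    return rate, origin


# Synthetic example: sampling years and mutation counts (hypothetical).
years = [2006.0, 2010.0, 2014.0, 2018.0]
mutations = [60.0, 100.0, 140.0, 180.0]
rate, origin = fit_clock(years, mutations)
print(rate, origin)
```

Real data scatter around the fitted line, so a sketch like this would normally be accompanied by confidence intervals on both the rate and the inferred origin.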

Once arisen, mutations become subject to 
selection. Positive selection, which acts to in- 
crease the frequency of mutations that confer 
advantageous traits, is usually the dominant 
force in cancer evolution; negative selection, 
which operates to remove deleterious muta- 
tions, is also detectable in cancer, although 
weak (34). In transmissible cancers, the sto- 
chasticity of transmission may decrease the 
efficiency of selection, and neutral processes, 
such as genetic drift, are likely to be of parti- 
cular importance in their evolution (42). 
Nevertheless, and despite the small sample 
size of our study, plausible signals of positive 
selection were detectable in DFT1 and DFT2, 
and it is likely that these are operating to 
increase the fitness of cells within tumors (e.g., 
PDGFRB and PDGFRA amplification in DFT1 
and DFT2, respectively, and MGA loss of func- 
tion in DFT1) and to enhance transmission 
potential (e.g., LoY in DFT2). Genetic variants 
that increase somatic mutation rate are them- 
selves often causatively involved in cancer 
through their tendency to predispose cells to 
acquisition of secondary adaptive mutations. 
This may be exemplified in the putatively 
positively selected heterozygous truncating 
mutation in MGA that is observed in mis- 
match repair-deficient DFT1 clade E. 

Predicting the future dynamics and impacts 
of DFT1 and DFT2 requires knowledge of 
these diseases’ epidemiological parameters. 
Although estimates of basic reproductive number
(R0) and generation time have been proposed
for DFT1 (15), considerable uncertainty
remains. Phylodynamics methods provide 
tools for inference of epidemiological metrics 
from pathogen genomes; however, the small 
sample size and geographical structuring of 
our tumor dataset make it unsuitable for such 
analysis (43). Although we cannot predict the 
evolutionary outcomes of DFT1 and DFT2, 
one observation that is worthy of comment is 
the surprisingly long delay between the origin 
of DFT1 (1982 to 1989) and its detection 
(1996). During this interval several hundred 
devils were examined in northeastern Tasmania, 
the location of DFT1’s first observation, but no 
evidence of DFT was recorded (4). This sug- 
gests that DFT1 may have remained at low fre- 
quency during this time and is compatible with 
a relatively low R0 or a longer than expected
generation time. This observation, together 
with that of the superspreading event that 
occurred shortly after DFT1’s origin that in- 
volved transmission of a tumor from a single 
donor to at least six recipients and founded 
the six DFT1 clades, lends credibility to the 
hypothesis that R0 may be overdispersed in
DFT and that a large fraction of transmissions 
may funnel through a small number of infec- 
tious tumor donors (44). Tumor, host, and 
seasonal factors may influence individual 
transmission potential (45). 
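
One way to see why overdispersion matters is a branching-process sketch in Python. It assumes, for illustration only, that the number of secondary tumors per donor follows a negative binomial distribution with mean R0 and dispersion parameter k (smaller k means stronger superspreading), and it computes the probability that a lineage started by a single donor eventually dies out. The model choice, the function name, and the parameter values are assumptions, not estimates from this study.

```python
def extinction_probability(r0, k, iterations=10_000):
    """Smallest fixed point of the negative binomial offspring PGF.

    PGF: g(s) = (1 + (r0 / k) * (1 - s)) ** (-k)
    Iterating q <- g(q) from q = 0 converges to the extinction probability.
    """
    q = 0.0
    for _ in range(iterations):
        q = (1.0 + (r0 / k) * (1.0 - q)) ** (-k)
    return q


# Hypothetical mean of 2 secondary tumors per donor, three dispersion values.
for k in (10.0, 1.0, 0.1):
    print(k, round(extinction_probability(2.0, k), 3))
```

With the mean held fixed, lowering k raises the extinction probability, which is in line with the idea that many incipient clones could disappear before detection.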

DFT1 and DFT2 have revealed the existence 
of a biological niche suited for transmissible 
cancers in Tasmanian devils. There is no evi- 
dence that these cancers emerged as a direct 
consequence of human actions through, for 
example, the introduction of chemical carci- 
nogens or oncogenic viruses. Thus, it seems 
most likely that DFTs are a natural part of 
Tasmanian devil ecology. Although post- 
colonial human activities may have created 
conditions that indirectly benefitted DFT 
emergence or spread, for example, through 
habitat modification that may have supported 
increased devil density (46), it is very likely 
that DFTs have occurred in the past and that 
additional clones will emerge in the future. 
Notably, many incipient DFTs may die out 
before detection, particularly if these diseases 
have superspreading dynamics. Although no 
specific actions can be taken to prevent the 
establishment of additional DFTs, it will be 
important to continue close monitoring of wild 
and captive devil populations. 

Although DFT transmissible cancers might 
themselves be natural occurrences, these dis- 
eases’ devastating impact on their host species 
is exacerbated by anthropogenic threats, includ- 
ing loss of habitat and roadkill (47, 48). Several 
recent studies have used longitudinal moni- 
toring data to parameterize models that predict 
future Tasmanian devil population size and 
have argued against DFT1-induced extinction 
as a likely outcome (49-51). However, there 
is consensus that the species remains under
threat, particularly given that its potential
for persistence at much reduced density is 
unknown. It is thus important that adaptive 
monitoring, research, and management con- 
tinue to be prioritized to ensure the long-term 
conservation and resilience of the Tasmanian 
devil (47, 52-54). 

Overall, this survey of the genomes of the 
two Tasmanian devil transmissible cancers 
has illuminated the evolutionary history 
of these unusual pathogens. Our analysis 
suggests that Tasmanian devils host a cell 
type that is poised for transmissible cancer 
transformation, with only minimal somatic 
genetic disruption required for these to be 
unleashed. Once established, DFT clones con- 
tinue to acquire mutations at constant rates, 
and, although most of these are neutral, a 
small subset drive further adaptation to the 
niche. The future trajectories of DFT line- 
ages and their Tasmanian devil hosts remain 
uncertain; however, this study provides a 
vantage point from which to further explore 
the evolution and impacts of transmissible 
cancers in this iconic marsupial species.

NEUROEVOLUTION


Syncytial nerve net in a ctenophore adds insights 
on the evolution of nervous systems 


Pawel Burkhardt, Jeffrey Colgren, Astrid Medhus, Leonid Digel, Benjamin Naumann,
Joan J. Soto-Angel, Eva-Lena Nordmann, Maria Y. Sachkova, Maike Kittelmann


A fundamental breakthrough in neurobiology has been the formulation of the neuron doctrine by Santiago
Ramón y Cajal, which stated that the nervous system is composed of discrete cells. Electron microscopy later
confirmed the doctrine and allowed the identification of synaptic connections. In this work, we used volume
electron microscopy and three-dimensional reconstructions to characterize the nerve net of a ctenophore,
a marine invertebrate that belongs to one of the earliest-branching animal lineages. We found that neurons in
the subepithelial nerve net have a continuous plasma membrane that forms a syncytium. Our findings suggest 
fundamental differences of nerve net architectures between ctenophores and cnidarians or bilaterians and 
offer an alternative perspective on neural network organization and neurotransmission. 


For more than a century, the structure
and evolutionary origin of the animal
nervous system have been at the center
of much debate among biologists. Fundamental
progress in our structural
understanding was put forward by Santiago
Ramón y Cajal, who postulated that the nervous
system is composed of discrete cells called
neurons (1). This contrasts with Camillo Golgi’s
proposition that the nervous system is a syn- 
cytial continuum. The discovery of synaptic con- 
nections between individual neurons by electron 
microscopy later confirmed Cajal’s theory. 
However, there is accumulating evidence that 
ctenophores, gelatinous marine invertebrates 
that move through the water column by ciliary 
comb rows, are among the earliest branching 
extant lineages of the animal kingdom (Fig. 1A) 
(2-5). Most ctenophore life cycles include a pred- 
atory cydippid stage during which, for some 
species, the ctenophore is able to reproduce 
only a few days after hatching (Fig. 1B) (6). 
Ancestral-state reconstruction suggests that 
the cydippid body plan is a plesiomorphic char- 
acter of ctenophores (7). 

The early split of ctenophores from other 
groups indicates that a nervous system, and 
maybe even neurons, could have evolved at 
least twice: once within the ctenophores and 
once within the lineage of the remaining ani- 
mals (8). Initiated through genomic analyses 
(2, 3), molecular and physiological features of 
the ctenophore nervous system were subse- 
quently interpreted to support this scenario 
(4, 5). In contrast to sponges and placozoans, 
ctenophores exhibit an elaborate nervous system
consisting of a subepithelial nerve net
(SNN), mesogleal neurons, a sensory aboral
organ, tentacle nerves, and diverse sensory 
cells in all parts of their body (Fig. 1C and 
movie S1) (9-14). Deciphering the develop- 
ment, structure, and function of the ctenophore 
nervous system is a key element to understand 
the origin and evolution of animal nervous 
systems. We have recently shown that a large 
repertoire of lineage-specific neuropeptides 
has evolved in the ctenophore Mnemiopsis 
leidyi (14). Furthermore, we identified a dis- 
tinctive feature of SNN neurons: the multiple 
neurites extending from one soma are inter- 
connected through anastomoses and thus form 
an extensive continuous network within a single 
nerve-net neuron (14). This characteristic sets
them apart from other animal neurons. Addi- 
tionally, there was little evidence on how these 
nerve-net neurons connect to each other, to sen- 
sory neurons, and to cells within the mesoglea 
because of the lack of synaptic markers suit- 
able for fluorescent labeling or large-scale elec- 
tron microscopic data that spans multiple 
neurons. In this study, we used high-pressure 
freezing-fixation techniques in combination 
with serial block face scanning electron micros- 
copy (SBFSEM) to establish the first ultra- 
structural three-dimensional (3D) network 
of SNN neurons and other cell types in a 
ctenophore. 


The cydippid SNN is organized in a syncytium 

Recent 3D reconstruction of a nerve-net neu- 
ron in a cydippid-phase M. leidyi has revealed 
a wide network of anastomosed neurites extending
from only one soma (14). However, to
understand the nature of connections between 
multiple nerve-net neurons as well as other 
cell types, we collected a larger continuous 
SBFSEM dataset of an early cydippid that 
includes 5 nerve-net neurons, 6 mesogleal 
neurons, and 22 putative sensory cells. The 
neurites of all five SNN cells were connected 
through an anastomosed continuous network 
(Fig. 2A). Whereas gap junctions could readily
be identified within comb plates (fig. S1),
we detected neither electrical nor chemical
synapses between the cells of the SNN. We
confirmed this observation in smaller data- 
sets of the nerve net beneath two comb rows 
and along the gut in two other cydippid in- 
dividuals (fig. S2). Additionally, injection of 
the fluorescent lipophilic dye 1,1'-dioctadecyl-3,3,3',3'-tetramethylindocarbocyanine
perchlorate (DiI) into only one of the cells of
two-cell staged embryos led to a fluorescent 
signal in only one-half of the cydippid body, 
which was seen in SNN cell bodies throughout
the animal, consistent with the syncytial
nature of the SNN (fig. S3). 

Morphologically, neurites within the SNN 
exhibited no obvious polarity (axon versus 
dendrite), showing similar diameter, dense-core 
vesicles throughout their length, and the lack 
of typical presynaptic triads (Fig. 2, A to C). 
Moreover, SNN neurites often showed a blebbed 
or “pearls-on-a-string” morphology (Fig. 2, D 
to G, and fig. S4). The narrow segments were 
often just wide enough for microtubules to 
pass (Fig. 2G and fig. S4), and bulged seg- 
ments often contained larger clear or electron- 
dense vesicles and occasionally endoplasmic 
reticulum (Fig. 2D and fig. S4). A recently de- 
veloped antibody against the neuropeptide 
ML02736a (14) confirmed the presence of neu- 
ropeptides within some of the vesicles of SNN 
neurons (Fig. 2E and fig. S5). Although SNN 
neurons seemed to lack synapses between 
each other, we identified chemical synapses 
from the SNN to polster cells (fig. S6), which 
suggest directional signal transmission from 
the SNN to effector cells. 


Mesogleal neurons form direct contacts with 
the syncytial SNN 


We identified and reconstructed six mesogleal 
neurons exhibiting a starlike morphology with 
extensive plasma membrane protrusions of 
variable lengths (Fig. 3A). Their somata were 
filled with a variety of vesicles and larger vac- 
uoles (Fig. 3B), and the protrusions of these 
cells did not show the pearls-on-a-string 
morphology present in neurites of the SNN. 
Some of the protrusions formed plasma mem- 
brane juxtapositions to neurites of the SNN 
(Fig. 3, A, D, and E). However, we did not find 
ultrastructural evidence for electrical or chem- 
ical synapses (Fig. 3E). In contrast to SNN 
neurons, we did not observe any electron- 
dense vesicles in mesogleal neurons (Fig. 3B), 
but instead small electron-lucent vesicles of a 
similar size as synaptic vesicles (Fig. 3C), 
which suggests a different type of information 
transmission. 


Sensory cells form simple circuits involving 
the syncytial SNN 


Fig. 1. Ctenophores and their
complex life cycle stages,
including a predatory cydippid
phase that hatches from the
egg and can reproduce after a
few days. (C) 3D reconstruction
of the nerve net, comb rows,
sensory cells, mesogleal neurons,
and a tentacle from
SBFSEM data of a 1-day-old
cydippid. (Inset) Phase contrast
image of a 1-day-old cydippid.
White box, area reconstructed
in (C). Scale bar, 100 μm.

We identified and reconstructed a total of
22 putative sensory cells from the present
and an earlier dataset (14) that fit into five
morphological groupings (Fig. 4, fig. S7, and
table S1). Some of them resembled known 
ctenophore sensory cell types (types 1, 4, and 5) 
(16, 17), whereas others exhibited a morphology 
that, to the best of our knowledge, has not 
been described previously (types 2 and 3) 
(Fig. 4, fig. S7, and table S1). We detected 
chemical synapses in several but not all puta- 
tive sensory cells that contact neuronal or 
other effector cells (Fig. 4 and fig. S7). Type 1
sensory cells exhibited a single long cilium 
and onion-root basal body (Fig. 4 and fig. S7, 
A and B). Type 2 sensory cells exhibited a 
very short single cilium without an onion-root 
basal body. Long neurites extending from 
their somata formed chemical synapses to 
polster cells (Fig. 4B and fig. S7, A and C). 

Type 3 sensory cells exhibited multiple cilia 
without onion-root basal bodies. Many large 
electron-dense vesicles are localized beneath 
the cilia (Fig. 4C and fig. S7, A and D). We
found one of these cells near the tentacle 
with a synaptic connection to a mesogleal 
neuron (Fig. 4C). Type 4 sensory cells ex- 
hibited a single long filopodium. Some of 
them formed synapses to neurites of the SNN 
(Fig. 4, A and D), and some also received syn- 
aptic input from type 1 sensory cells (Fig. 4A). 
Type 5 sensory cells exhibited multiple long 
filopodia. They formed plasma membrane 
contact to polster cells, but we did not detect 
synaptic contacts from or to this cell type. 
Last, we used the 3D ultrastructural evidence 
to identify several discrete and simple neural 
circuits in early cydippid-phase M. leidyi. 
These circuits included synaptic signal trans- 
mission from sensory cells to other cell types 
including SNN neurons, mesogleal neurons,
polster cells, or even other sensory cell types
(Fig. 4, A to D). 
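
The connections reported above can be collected into a small directed graph, sketched here in Python. The edges are limited to the chemical synapses described in the text (type 1 to type 4 sensory cells, type 4 sensory cells to the SNN, type 2 sensory cells to polster cells, a type 3 sensory cell to a mesogleal neuron, and the SNN to polster cells). The node labels and the helper function are hypothetical names chosen for illustration, and nonsynaptic membrane contacts are left out.

```python
from collections import deque

# Directed edges: presynaptic cell type -> postsynaptic cell types.
SYNAPSES = {
    "type1_sensory": ["type4_sensory"],
    "type2_sensory": ["polster_cell"],
    "type3_sensory": ["mesogleal_neuron"],
    "type4_sensory": ["SNN"],
    "SNN": ["polster_cell"],
}


def downstream(start, graph=SYNAPSES):
    """Return all cell types reachable from `start` through chemical synapses."""
    seen = set()
    queue = deque(graph.get(start, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(graph.get(node, []))
    return seen


print(sorted(downstream("type1_sensory")))
```

Under these assumptions, a breadth-first traversal from a type 1 sensory cell reaches the polster cells by way of type 4 sensory cells and the SNN, whereas type 5 sensory cells, for which no synaptic contacts were detected, do not appear in the graph.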


Discussion 


In the debate at the end of the 19th century 
about the organization of the animal nervous 
system, Joseph von Gerlach (1871) (18) and
Camillo Golgi (1885) (19) put forward the
reticular theory (also known as the syncytial 
theory). Both proposed the cellular continu- 
ity of neurons. This view was challenged by 
Cajal (1888) (1), who proposed an organization
from discrete cellular units connected
through synapses. Both contestant theories 
were founded on Golgi’s newly invented black 
staining that enabled scientists to study the 
detailed morphology of neurons and their 
neurites (20). Golgi and Cajal were honored 
with the Nobel Prize in Physiology or Medicine 
in 1906 for their efforts in elucidating the 
architecture of the nervous system (20). How- 
ever, with the advent of electron microscopy 
in the 1950s and the discovery of the synaptic 
cleft, the reticular theory was put to rest in 
favor of Cajal’s hypothesis (21, 22). In our
present study, volume electron microscopy 
revealed the 3D ultrastructural architecture 
of the SNN in an early cydippid-phase cteno- 
phore, providing evidence for its reticular—or 
syncytial—organization. Previous work sug- 
gested anastomosed nerve cords in adult cteno- 
phores on the basis of chemical staining (9) and 
multiple parallel strands of neurites stained 
with anti-tyrosylated-α-tubulin (10). In this
work, we showed that a syncytial nerve net 
already exists in cydippid-phase M. leidyi. 
This syncytium may be reinforced in adult 
animals through the anastomosis of additionally
formed neurites; however, confirmation
of such connectivity will require further
detailed, high-resolution analysis of the nerve 
net throughout development. 

Using high-pressure freezing and freeze 
substitution techniques to preserve fine ultra- 
structural details with minimal fixation artifacts, 
we showed that the SNN forms a continuous 
structure. This is further supported by the 
unrestricted spread of DiI throughout the
nerve net. 

Whereas gap junctions could be identified 
within the comb plates, as previously reported
(5), in our SBFSEM data as well as in TEM
micrographs, we found no evidence of similar
structures between neurites of nerve-net neu- 
rons that would suggest the presence of electrical 
synapses. Additionally, a recent characterization 
of the complete set of M. leidyi innexins— 
responsible for the formation of gap junctions 
in invertebrates—did not show any mRNA ex- 
pression in in situ hybridization experiments 
in nerve-net cell bodies (23). However, we 
did observe synaptic triads and plasma mem- 
brane contacts of unknown molecular struc- 
ture that connect the SNN externally to 
polster cells and mesogleal neurons.

Fig. 2. Connectivity and ultrastructure
of the ctenophore
SNN. (A) 3D reconstruction of
five SNN neurons. White asterisks
indicate examples of the
continuous membrane between
the cell bodies of neurons 1 and
2. (B) 3D reconstruction of the
SNN neuron cell bodies showing
the nucleus (blue) and dense-core
vesicles (orange). (C) TEM
cross section of an SNN
neuron cell body that shows
ultrastructural details, including
large, dense-core vesicles (white
arrowhead). Scale bar, 1 μm.
(D) TEM cross section of an SNN
neurite with dense-core and
clear-core vesicles localized in
blebbed areas (white and orange
arrowheads, respectively). Scale
bar, 500 nm. (E) Antibody
staining against neuropeptide
ML02736a (green) in SNN neurites
(magenta) stained for
tubulin. (F) TEM 3D reconstruction
of SNN neurite (violet) and
dense-core vesicles (orange),
highlighting the blebbed morphology.
(G) TEM cross section
of SNN neurites showing continuous
microtubules (orange
arrows) passing through narrow
segments. Scale bar, 500 nm.

Previous characterizations of ctenophore
nerve nets have been predominantly based
on traditional histochemical staining techniques
(9, 24) and more recently on fluorescence
microscopy of antibody staining
against α-tubulin (10, 12, 13, 25). Although
both techniques provide valuable insight
into the general organization and location
of ctenophore neurons, they do not allow the
investigation of the ultrastructure and nature
of neuronal connections. Data from transmission
electron microscopic serial sections
(26, 27) may also have overlooked this distinct
syncytial architecture because of the difficulty
to produce continuous section series
over such a large volume. Aside from reports 
on single self-anastomosing neurites in other 
animals (28-30), the presence of a complete 
syncytial nerve net has only been reported for 
the cnidarian, medusae-like colonial polyp, 
Velella (31, 32). However, to the best of our 
knowledge, the syncytial organization of this 
nerve net has not yet been verified on an ultra- 
structural level. At this time, we have found this 
feature only in the ctenophore M. leidyi nerve
net, but further analysis across nerve net- 
bearing animals may provide exciting insights 
into early nervous system evolution and modes 
of neuronal connectivity. 

Although neurite fusion and pruning seem 
to be a common principle during the early 
neural development in many animals (33, 34), 
we do not consider the syncytial cydippid SNN 
to be completely remodeled by such a pro- 
cess later in development. It was suggested 
that the early cydippid phase is not a larval 
phase but rather an autonomous life history 
phase of M. leidyi and other ctenophores (6). 
Indeed, cydippid-phase M. leidyi are free-swimming
pelagic predators, able to reproduce
and exhibit complex behaviors as described
for their second, reproductive, lobate phase 
(35-37). 

Our identification of the nonsynaptic archi- 
tecture of the cydippid-phase SNN raises the 
question of the mechanism of signal propaga- 
tion. Genome and single-cell transcriptome 
analyses revealed that M. leidyi SNN neurons
express 1 voltage-gated calcium (Cav),
35 potassium (Kv), and 2 nonspecific sodium
(Nav) channels (14, 38, 39). These numbers are
similar to those in neurons of other animals, 
and ctenophore SNN neurons may therefore 
be able to produce membrane potential or 
even action potentials (40). Moreover, the 
presence of numerous peptidergic vesicles 
in the SNN suggests that signal transmission 
also occurs through neuropeptide release, and 
the Cav channel expressed in these cells might
be involved in exocytosis (14, 41). Therefore,
we can speculate that the SNN could function 
as a neuroendocrine system that is able to 
release transmitters into the mesoglea through 
vesicle fusion with the plasma membrane at 
different neurite sites. Such a system would 
require only a minimum number of chemical 
synapses and, if acting at short distances, may
reach enough effector cells. Studies on the conduction
velocity in ctenophores have shown a
slower speed of signal propagation compared 
with that of nerve nets and conducting epithelia 
of other animals (42), indicating that signal 
propagation could be nonsynaptic. 

Fig. 3. Close association
of mesogleal neurons and
the SNN. (A) 3D reconstruction
of SNN (violet)
and mesogleal neurons
(yellow) from SBFSEM data.
(B) TEM cross section
of a mesogleal neuron cell
body. Different types of
clear vesicles and vacuoles
but no dense-core vesicles
are present. Scale bar,
1 μm. (C) 3D reconstructed
mesogleal neuron with
three long neurites that
contain small clear vesicles
(blue arrowheads). (Inset)
TEM cross section of mesogleal
neurites with small
clear vesicles. Scale bar,
inset, 200 nm. (D) 3D
reconstruction of mesogleal
neuron with contact site (white box) to SNN. (E) Corresponding SBFSEM image of
contact site between mesogleal neuron and SNN neuron. No chemical or electric
synapse structures could be observed. Mn, mesogleal neuron. Scale bar, 500 nm.

Fig. 4. 3D reconstruction of
sensory cells allows for the
identification of simple
circuits. (Top) Localization of
each circuit (pink square).
(Middle) 3D reconstructions of
sensory and effector cells.
Mitochondria are shown in
yellow to represent synaptic
tripartite complexes in all
circuits. (Bottom) Proposed
wiring diagram. (A) Circuit
between type 1 and type 4
sensory cells and SNN.
(B) Multiple synaptic connections
between a type 2 sensory
cell with short cilium and
comb cells. (C) Synaptic connection
between a type 3
sensory cell near a tentacle
and a mesogleal neuron.
(D) A type 4 sensory cell with
single filopodium synapses
onto nerve net.

Additionally, our ultrastructural identification
of simple circuits now provides a basis
that allows for better understanding of how
mechanoreception, swimming, and prey-capture
behavior in young cydippid-phase ctenophores
could be facilitated. Numerous sensory neurons
are connected through chemical synapses to
the nerve net, which in turn forms chemical
synapses onto effector cells such as the comb
rows or ciliated groove cells (14). Type 1 ciliated
sensory cells and type 4 filopodiated sensory
cells, previously described as Tastborsten
and Taststifte (9), have been postulated to
be sensitive to water vibrations and touch
(17, 43, 44). Their abundance throughout the
epidermis and direct cell-to-cell contact to the
nerve net (many through chemical synapses)
underscore the importance of direct transmission
of localized vibration and touch information
to the SNN. Morphological analysis
allows us to speculate that a type 2 sensory
cell, which wraps around polster cells, may be
able to detect water flow and thus alter comb- 
beat frequency, whereas a type 3 sensory cell, 
the multiple cilia of which are in close contact 
with the tentacle, may be triggered by food cap- 
ture. Functional experiments are needed to 
fully understand the activity of these circuits 
and to unravel the different modes of signal 
transmission used by the different ctenophore
neuronal cell types. This study is limited to the 
analysis of an early developmental stage in 
which fixation of whole animals with high- 
pressure freezing is still possible. Comparison 
with other ctenophore species and investiga- 
tion of later life history stages of M. leidyi are 
needed to clarify whether a syncytial SNN is a 
feature restricted to an early ontogenetic phase
in only a few species or is a common feature of 
all ctenophores. This approach will also pro- 
vide valuable insights into the development of 
the syncytial SNN, in particular whether neurons divide
but remain connected in the cydippid SNN, or 
whether neurites from different cell bodies 
reach out and fuse. 
Whether neurons of animals have a single
origin or possibly originated more than once 
during evolution is a debated topic. The exist- 
ing data on the ctenophore nervous system 
show a specific mosaic of cellular and syncytial 
components with distinct evolutionary his- 
tories. It will be a major future challenge to 
clearly identify the parts of the mosaic that 
may have evolved independently and the 
preexisting parts that were strongly modi- 
fied. Our study underscores that the resem- 
blance between the nerve net of ctenophores 
and the nerve nets of cnidarians and bilaterians 
might only be superficial because it appears 
that their connectivity is fundamentally differ- 
ent. Our ultrastructural analysis of the cteno- 
phore SNN not only puts ctenophores at the 
center of nervous system evolution but also 
provides an opportunity to explore the boun- 
daries of nervous system organization and 
function. 
The value of private properties for the conservation
of biodiversity in the Brazilian Cerrado 


Paulo De Marco Jr., Rodrigo A. de Souza, André F. A. Andrade, Sara Villén-Pérez,
Caroline Corrêa Nóbrega, Luiza Motta Campello, Marcellus Caldas


Areas set aside for conservation within private lands may be key to enhancing biodiversity-friendly 
landscapes. This conservation strategy should be especially effective in highly threatened regions that 
are poorly protected by public lands, such as the Brazilian Cerrado. Brazil’s Native Vegetation Protection 
Law has included set-aside areas within private properties, but their relevance to conservation has 
not been evaluated. We assess whether private lands are contributing to biodiversity in the Cerrado, 
a global biodiversity conservation priority and major region for food production, where land use 
conflicts are often at odds with conservation objectives. We determined that private protected areas 
accommodate up to 14.5% of threatened vertebrate species ranges, which increases to 25% when 
considering the distribution of remaining native habitat. Moreover, the spatial spread of private 
protected areas benefits a large number of species. Ecological restoration of private protected lands 
would improve the benefits of this protection system, especially in the Southeastern Cerrado, where a
large economic hub meets a threat hotspot.


Protected areas are the cornerstones
for the long-term conservation of biodiversity.
They cover about 15% of the
terrestrial surface and 7.3% of the ocean
surface (1), and global analyses show
that they are still insufficient to protect bio- 
diversity (2). The need for complementary 
strategies to join or reinforce protection 
networks is especially urgent to deal with 
the lack of connectivity due to habitat frag- 
mentation. A promising approach is to make 
landscapes that are now occupied by eco- 
nomic activity more “biodiversity-friendly” 
(3). Biodiversity-friendly landscapes seek to 
preserve habitat patches in human-dominated 
areas to favor the persistence of native species 
(3), including beneficial animals such as pol- 
linators, predators, and fruit dispersers (4). 
Most human-dominated areas are under pri- 
vate ownership, and this represents a large 
proportion of global land, varying from 44.2% in 
Brazil (5) to 52% in Germany, 75% in the United 
States (excluding Alaska) (6), and nearly 80% 
in the United Kingdom and Spain (7). Thus, 
improving the biodiversity-friendliness of pri- 
vate landholdings could amplify the benefits 
of the existing protection system by increasing
both the total habitat available and the
connectivity among remaining habitats (8), 
ensuring population persistence and richer 
biodiversity (9). However, conservation pri- 
oritization efforts have often overlooked 
the role of private lands while focusing on 
public protection networks (10). Here, we 
provide an evaluation of the relevance of 
set-aside areas of private land in one of the 
most important and vulnerable worldwide 
arenas for the conflict between food pro- 
duction and biodiversity conservation: the 
Brazilian Cerrado (11). In addition, we present 
potential scenarios for restoration priorities 
to optimize the protection of 103 threatened 
terrestrial vertebrates in the biome. 
Biodiversity-friendly landscapes must be de- 
signed to increase connectivity among habitat 
patches and to maintain sufficient habitat to 
assure the long-term persistence of biodiversity 
(8, 12). The strategy to implement this ap- 
proach varies widely across the land-sharing 
and -sparing continuum and from voluntary 
to mandatory actions implemented by different
countries (13). In parts of Australia, for
example, there is a well-established model in 
which some rights are voluntarily relinquished 
in favor of conservation under a binding legal 
agreement and in exchange for economic
incentives (14). Similar approaches are also found
in the United States and Canada (15). In Latin 
America, the scheme is similar but shows a 
larger participation of nongovernmental or- 
ganizations (NGOs) in land purchase for con- 
servation, especially in Costa Rica, Ecuador, 
Argentina, and Chile (16). Otherwise, man- 
datory regulations to protect a portion of 
every rural property may represent a mean- 
ingful strategy for conservation in largely 
human-dominated landscapes. One of the 
best-established examples of this policy was
implemented in the Brazilian Forest Code
almost a century ago (federal decree no. 23.793/
1934 and federal law no. 4.771/1965). It was
originally conceived under a utilitarian view 
that focused on the importance of vegetation 
to water resources, soil fertility, and wood 
storage within rural properties (17). Even so,
the Forest Code has important implications 
for biodiversity conservation today. 

To enforce the sustainable use of natural 
resources, the Brazilian Forest Code required 
rural owners to select patches to become 
legal reserves within their rural properties. 
The legal reserve area varies from 20%
(the criterion for most of Brazil) to 80% (for the
Amazon) of the property area. By contrast,
the location of permanent protection areas
cannot be chosen by the owner because they are designed to
protect geological stability (e.g., topographic 
slope higher than 45°) and water resources 
(e.g., areas around streams, rivers, and springs). 
The existence of legal reserves and permanent 
protection areas has suffered long-standing 
pressure from political and economic sectors, 
with reiterated attempts to change this legis- 
lation during its history. As a consequence, 
some changes were implemented in the 2012 
Native Vegetation Protection Law (federal 
law no. 12.651/2012) to maintain the general 
definitions of existing categories but allow
new deforestation, mainly in the Cerrado 
biome. The 2012 Forest Code also demands 
that landowners provide georeferenced infor- 
mation about land uses and protected areas 
in their rural properties through the Rural 
Environmental Registry [Cadastro Ambiental 
Rural (CAR)]. Here, we take the opportunity 
created by CAR to analyze the spatial dis- 
tribution of all private protected areas from 
684,942 rural properties registered in the 
Cerrado biome to assess its potential value 
for the conservation of threatened vertebrate 
species and to predict the potential benefits 
of fully restoring set-aside areas. The Cerrado 
is a wooded grassland, or savanna, covering 
about 20% of Brazil. It is home to distinctive 
and threatened species, such as the maned wolf 
(Chrysocyon brachyurus), the giant anteater 
(Myrmecophaga tridactyla), and the highly en- 
dangered blue-eyed ground dove (Columbina 
cyanopis). By 2018, natural vegetation loss 
reached 90 million ha—45% of Cerrado area— 
mostly in private property (17, 18).

All analyses are based on the overlap be- 
tween model predictions of species’ ranges 
and the proportion of private protected areas 
within each 10-km-by-10-km cell, derived from CAR’s polygons.
We used conservative estimates of species’ 
ranges based on advanced ecological niche 
modeling techniques that account for dis- 
persal constraints (19). Landscape-cell rele- 
vance to conservation was estimated by giving 
higher weight to smaller-ranged species, which 
are proportionally more affected by habitat 

Fig. 1. The proportion of distributional range of threatened vertebrate species that falls within legal reserves and permanent protection areas in relation to its range size in the Cerrado biome. (A to F) Historical species’ ranges that overlap legal reserves (LR) (A) and permanent protection areas (PPA) (B), as estimated by the regression through the origin (black lines), are 13.01 and 3.21%, respectively. The null expectation (red dashed line) is that each species overlaps both categories according to the proportion of those classes across the whole Cerrado area (12.80% for legal reserves and 4.20% for permanent protection areas). The predicted mean proportion of species distribution within total private protected areas (C) is 14.48%. The portion of species’ ranges with available remaining habitat that overlaps with legal reserves (D) and permanent protection areas (E) is larger (black lines; 23.06 and 5.49%, respectively) and presents a larger interspecific variation but is close to the null expectation (red dashed line). After discounting habitat loss, the predicted mean proportion of species distributions within total private protected areas (F) is high (25.04%) and close to the null expectation. The estimate of the overlap (slope of black line) is indicated in each plot, together with the R² and its statistical significance.


loss. Moreover, we assumed that species can 
persist in the private protected patches inde- 
pendently of their size, isolation, or type of 
surrounding matrix because species-specific 
sensitivity to these variables is unknown for 
most of the species we evaluated (20). We 
start by assuming that private protected areas 
are fully restored, though a considerable part 
of those set-aside areas is, at present, not well 
preserved and suffers from human interference
(11). This assumption is relevant because
their restoration is mandatory even under the
current Forest Code (21). Thus, our analysis
assesses the conservation value if restoration 
is properly implemented. Finally, we explore 
this further by indicating where restoration 
will bring higher benefits to biodiversity. 

We show that an average of 13.01% of the 
range of threatened species falls within legal 
reserves [slope of the regression of range 
within legal reserves and total species’ range 
in Cerrado, forced through the origin; coefficient
of determination (R²) = 0.987; Fig. 1A].
This value is only slightly higher than the null 
expectation, which is the percentage cover of
legal reserves in Cerrado (12.86%). The sim- 
ilarity to the null expectation suggests that 
these areas are representative of the envi- 
ronmental variation in the Cerrado favoring 
better representation of species’ distribution 
ranges of threatened vertebrates. This hypothesis 
was supported both by the frequency distribution 
of public and private areas along the first axis of a
climatic principal components analysis (PCA)
and by the overlap of the entire environmental 
variation of the Cerrado (figs. S2-1 and S2-2). 
The mean proportion of the predicted species 
ranges that fall within permanent protection 
areas is lower (3.21%; the slope of the regression;
R² = 0.944; Fig. 1B). This prediction
is also slightly lower than the null expectation 
that species overlap is only determined by 
the proportion of this category in the whole 
Cerrado (4.26%). Applying the same analysis 
for the current public (federal and state lev- 
els) protected area system in the Cerrado 
shows that only 13.78% of species’ ranges is 
protected, slightly lower than the null expectation
based on the coverage of public protected
network in this region (15.30%; fig. S2-3).
Private protected land is more evenly distrib- 
uted across the Cerrado and thus is better 
suited to benefit a larger number of species. 
This is a desirable quality that contrasts with 
the public protection system, which is biased 
toward less-favorable lands for agriculture and 
does not represent the distribution of most 
threatened species (10). Moreover, private
land may represent an even higher proportion 
of threatened species’ ranges if we restrict the 
analysis to the available remnants of native 
vegetation in the Cerrado. In that case, the 
predicted mean proportion of species’ ranges 
within legal reserves is 23.06% and, for permanent
protection areas, is 5.49% (R² = 0.851
for legal reserves and R² = 0.756 for permanent
protection areas; Fig. 1, D and E). The general 
agreement to the null expectation still holds, 
but there is an increased scattering that 
suggests higher interspecies variation in their 
level of protection. This variation supports the 
nonrandom distribution of habitat loss in the 
Fig. 2. Priority areas for the restoration of private lands based on their
contribution to threatened species conservation. (A and B) Identification
of restoration priorities within private protected areas (A) and their 
distribution in the Cerrado (B) based on their relevance to threatened species 
conservation, considering both the biodiversity relevance index and private 
protected area size. (C) Prediction of the conservation milestones that would 


Cerrado, which causes different levels of ex- 
posure among species (20). 
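
The overlap estimates above are slopes of a regression forced through the origin, compared against a null expectation equal to the fraction of the Cerrado covered by each category. A minimal sketch of that calculation follows. The range values are hypothetical and are not data from the study, and the uncentered R² used here is one common convention for no-intercept models; the text does not say which definition the authors used.

```python
def slope_through_origin(x, y):
    """Least-squares slope b of y = b * x (no intercept) and uncentered R^2."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    b = sxy / sxx
    ss_res = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum(yi * yi for yi in y)  # uncentered total sum of squares (assumption)
    return b, 1.0 - ss_res / ss_tot


# Hypothetical species ranges (km^2); illustrative only.
total_range = [1000.0, 5000.0, 20000.0, 80000.0]
range_in_reserves = [140.0, 610.0, 2700.0, 10300.0]

slope, r2 = slope_through_origin(total_range, range_in_reserves)
null_expectation = 0.1286  # legal-reserve cover of the Cerrado, as reported in the text
print(f"overlap = {slope:.2%}, R^2 = {r2:.3f}, null = {null_expectation:.2%}")
```

A slope above the null value would indicate that species' ranges overlap the category more than expected from its area alone; a slope below it would indicate the opposite.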

The combined effect of legal reserves and 
permanent protection areas may protect up to 
14.48% of species’ distribution ranges in the 
Cerrado (R² = 0.964; Fig. 1C). This prediction
increases to 25.04% after discounting for cur- 
rent habitat loss outside private protected 
assigned areas (R² = 0.791; Fig. 1F). This coverage
is consistently higher than that expected
from the distribution of total private protected 
area in the Cerrado (9.7%) and the remain- 
ing habitat in these protected areas (19.7%). 
Our results show that private lands may protect
nearly 25% of the remaining climatically
suitable habitats for threatened vertebrates
in the Cerrado, so their relevance for conservation
is much higher than is now assumed.
This also highlights the importance
of habitat restoration within set-aside private
lands, which is expected to occur under
the current legal system. First, an increase in
habitat amount due to restoration of private 
protected areas is expected to increase spe- 
cies’ population sizes and reduce their risk 
of extinction (22). In addition, an increase in 
connectivity among remaining habitats is 
expected to favor species’ dispersion and
persistence in the landscape (3).

Different private lands are not equally im- 
portant to species conservation across the 
Cerrado region. The spatial variance of the 
relevance-to-conservation index (Fig. 2, A
and B) shows that a small set of 192 cells,
mostly distributed in the most highly affected
São Paulo state, has a disproportionate importance
to conservation. Those areas are part of
the distribution of at least 70 small-ranged 
species and still bear a relatively large amount 
of protected land to restore within its pre- 
dicted distribution. The entire set retains 
nearly 145,000 ha of protected private land, 
with an estimated cost of restoration not 
higher than $60 million based on assisted 
regeneration methods, which is only 0.02% 
of the export value of the Brazilian agribusiness sector in 2021
(https://indicadores.agricultura.gov.br/agrostat/index.htm). An
analysis of potential scenarios for the prior- 
itization of areas within set-aside private 
lands shows that after ordering all Cerrado 
cells according to their relevance for conser- 
vation, the cumulative protected private land 
area points to a positive scenario. Restoration 
be reached as an increased number of private protected areas are

restored after the prioritization established in (A). Conservation milestones 
are determined in terms of the number of species that would benefit and 
the percentage of their range that would be preserved. The dotted, dashed, 
and solid blue lines represent the conservation targets of 10, 15, and 20% of 
species range size, respectively. 


of the 10% top-priority cells will achieve the 
goal of 10% range protection for 10 threat- 
ened species, a 25% restoration will achieve 
the same goal for 26 species, and a 50% res- 
toration will achieve the same goal for 49 
species. More ambitious conservation targets 
(15 or 20% of species’ range protection; Fig. 2C) 
are attained only for a small number of spe- 
cies or under optimistic restoration scenarios. 
For instance, restoring 75% of the protected 
private land would protect 20% of the range 
of 17 species, which includes many small- 
ranged species that are well represented in 
the Cerrado biome. We argue that an explicit 
policy to assure restoration will return clear 
conservation benefits based on those scenarios. 
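
The scenario analysis above can be read as a cumulative ranking exercise: order cells by their relevance to conservation, restore the top fraction, and count the species whose protected share of range reaches a target. The sketch below illustrates that logic under simplifying assumptions. The cell records, the `relevance` field, and all areas are hypothetical, and the study's actual index (which weights smaller-ranged species more heavily) is not reproduced here.

```python
def species_meeting_target(cells, ranges, top_fraction, target):
    """Count species whose restored protected area reaches `target` of their range.

    cells: list of dicts with a "relevance" score and a "protected" mapping
           of species name -> protected area inside that cell (hypothetical layout).
    ranges: mapping of species name -> total range area.
    """
    ranked = sorted(cells, key=lambda c: c["relevance"], reverse=True)
    n_restored = int(round(top_fraction * len(ranked)))
    covered = {}
    for cell in ranked[:n_restored]:
        for species, area in cell["protected"].items():
            covered[species] = covered.get(species, 0.0) + area
    return sum(
        1 for species, total in ranges.items()
        if covered.get(species, 0.0) / total >= target
    )


# Hypothetical cells and ranges; illustrative only.
cells = [
    {"relevance": 0.9, "protected": {"sp_a": 12.0}},
    {"relevance": 0.5, "protected": {"sp_a": 5.0, "sp_b": 20.0}},
    {"relevance": 0.2, "protected": {"sp_b": 30.0}},
    {"relevance": 0.1, "protected": {"sp_b": 60.0}},
]
ranges = {"sp_a": 100.0, "sp_b": 1000.0}

for fraction in (0.25, 0.5, 1.0):
    n = species_meeting_target(cells, ranges, fraction, target=0.10)
    print(f"restore top {fraction:.0%} of cells -> {n} species reach the 10% target")
```

Raising `target` to 0.15 or 0.20 in this toy example reduces the number of species that qualify at any given restoration level, which mirrors the pattern described for the more ambitious targets.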

Our results support the importance of pri- 
vate lands to the protection of threatened 
Cerrado species. They indicate that restoring 
private protected areas is an important con- 
servation goal that deserves special funds and 
attention. We show that, at least for the con- 
servation of threatened terrestrial vertebrate 
species, it is possible to devise a prioritization 
scheme to guide the restoration efforts of those 
areas. In addition, private protected area res- 
toration would also have direct effects on 
essential ecosystem services. Based on recent
calculations of carbon storage in Cerrado areas 
(23), we made a conservative estimate that 
shows that the restoration of private pro- 
tected areas could capture 12 x 10° tonnes of 
carbon, which is a substantial contribution 
toward the 2°C climate target (24). Restor- 
ing private set-aside areas may also improve 
pollination services for major crops, such 
as soybean, and other relevant croplands of 
fruits and vegetables. Although those services 
are not always recognized by private owners 
(25), private land protection still carries the 
possibility of increasing the visibility of its 
benefits, thereby boosting restoration efforts. 
The choice to dedicate land and resources for 
biodiversity conservation is political and in- 
fluenced by the value that people place on 
biodiversity (26). Conservation in private lands 
may increase the perception of ecosystem ser- 
vices and promote willing-to-conserve atti- 
tudes (27), thus reinforcing society's positive 
view of biodiversity conservation. 
Structure of the R2 non-LTR retrotransposon
initiating target-primed reverse transcription 


Max E. Wilkinson, Chris J. Frangieh, Rhiannon K. Macrae, Feng Zhang


Non-long terminal repeat (non-LTR) retrotransposons, or long interspersed nuclear elements (LINEs), 
are an abundant class of eukaryotic transposons that insert into genomes by target-primed reverse 
transcription (TPRT). During TPRT, a target DNA sequence is nicked and primes reverse transcription 
of the retrotransposon RNA. Here, we report the cryo-electron microscopy structure of the Bombyx mori
R2 non-LTR retrotransposon initiating TPRT at its ribosomal DNA target. The target DNA sequence is 
unwound at the insertion site and recognized by an upstream motif. An extension of the reverse 
transcriptase (RT) domain recognizes the retrotransposon RNA and guides the 3’ end into the RT active 
site to template reverse transcription. We used Cas9 to retarget R2 in vitro to non-native sequences, 
suggesting future use as a reprogrammable RNA-based gene-insertion tool. 


Non-long terminal repeat (non-LTR) retrotransposons
are the most abundant
class of mobile genetic element (MGE)
in the human genome, primarily represented
by the LINE-1 and SINE (or
Alu) long and short interspersed nuclear ele- 
ments, respectively (1). Despite their prevalence 
and contribution to genetic diversity and dysregulation
through mutagenicity and recombination
(1-3) and their prospective use as gene
insertion tools, there is much left to understand 
about the mobility mechanisms of non-LTR 
retrotransposons (4). Pioneering research 
on the Bombyx mori (silk moth) R2 element 
(R2Bm), which selectively inserts into the 28S 
ribosomal RNA (rRNA) gene, has contributed 
substantially to our understanding of this type 
of MGE (5). R2, like all non-LTR retrotranspo- 
sons, encodes an open reading frame (ORF) 
with DNA binding, endonuclease, and reverse 
transcriptase activities (Fig. 1A). The endonu- 
clease domain (restriction-like endonuclease, 
RLE) nicks the target DNA, and the reverse 
transcriptase domain uses the exposed 3’ end 
from the nick to prime reverse transcription 
of the R2 RNA, resulting in a new genomic 
copy of the R2 element (Fig. 1B) (6, 7). This 
process is called target-primed reverse tran- 
scription (TPRT) and is characteristic of non- 
LTR retrotransposons and their group II intron 
ancestors (8, 9). The nicked strand that primes 
reverse transcription is referred to as the bottom
strand. Complementarity between the bottom
strand and the 3’ end of the R2 RNA (3’
homology) is not required to initiate reverse
transcription (10). Non-LTR retrotransposons
are specific for reverse transcribing their own 
RNA; for R2, this specificity requires an ele- 
ment in the 3’ untranslated region (3’UTR), 
but the precise motif has not been located (11).
It is also unclear how R2 specifically recog- 
nizes the 28S rRNA target gene, or how DNA 
nicking is coupled to reverse transcription 
within the same protein. To address these 
questions, we solved a cryo-electron micros- 
copy (cryo-EM) structure of R2Bm initiat- 
ing TPRT at the 28S rRNA gene using its own 
3'UTR. The structure reveals an extensive in- 
terface with the target DNA, a small core re- 
gion of the 3'UTR required for TPRT, and shows 
that R2Bm can be engineered to reprogram its 
insertion site. 


Reconstitution and cryo-EM structure of an 
R2 TPRT complex 


We overexpressed R2Bm in Escherichia coli 
and purified it to apparent homogeneity (fig. 
S1). The purified protein was active in vitro, 
reproducing previously found biochemical ac- 
tivities, including RNA-stimulated nicking of 
the target DNA bottom strand, site-specific 
TPRT when supplied with in vitro-transcribed 
3'UTR RNA, and low levels of template jumping
(Fig. 1C) (6, 12). It is unclear if 3’ homology
is required for TPRT in vivo; however, con- 
sistent with previous findings, we found that 
downstream sequences of up to 10 nucleotides 
(nt) do not inhibit activity in vitro (Fig. 1C) 
(10). Sequencing of TPRT junctions confirmed 
that homology-mediated TPRT is more likely 
to initiate reverse transcription at the 3’ end 
of the 3’UTR rather than skipping bases or 
inserting untemplated nucleotides (fig. S2) (10).
To assemble a complex stalled during initiation 
of TPRT, we incubated R2Bm with target DNA, 
3'UTR RNA, and the chain-terminator nucle- 
otide 2',3'-dideoxythymidine (ddT), which mi- 
mics the first nucleotide incorporated in the 
TPRT reaction (dT) but does not allow further 
elongation. Purified TPRT complexes contained
stoichiometric amounts of R2Bm, 3'UTR RNA,
and target DNA with >99% of the bottom
strand nicked (fig. S1). Initial attempts at cryo-
EM imaging failed owing to the preferred 
orientation and flexibility of the complex. To 
overcome these issues, we used a carbon sup- 
port on the cryo-EM grid and added 5 nt of 
downstream 28S rRNA sequence to the 3’ end 
of the 3'UTR RNA to stabilize the complex by
forming a primer-template duplex with the 
target DNA bottom strand. With these modifications,
we obtained a cryo-EM reconstruction
of the R2 TPRT complex at 3.1-Å resolution
(Fig. 1D, figs. S3 and S4, and table S1). 

The core of the R2Bm protein is a reverse- 
transcriptase (RT) domain similar to that of 
group II intron RTs (13), followed by a C-terminal
α-helical thumb domain and preceded by a
characteristic N-terminal extension domain
(NTE0) implicated in template switching (14),
but the R2Bm RT includes a further N-terminal
extension (NTE-1) that binds the 3'UTR RNA
(Fig. 1, E and F) (15). Preceding the NTE-1 
element are two DNA binding domains: the 
N-terminal C2H2 zinc finger domain (N-ZnF) 
and a Myb domain. C-terminal to the thumb 
domain lies an α-helical linker domain that
packs against the thumb, followed by a CCHC
zinc-finger domain (ZnF) conserved in many
LINE ORFs (4). The ZnF then links to the 
C-terminal RLE domain, which cleaves the 
target DNA. This domain arrangement closely 
resembles that of Prp8 (13, 16, 17), the core 
protein of the spliceosome, underscoring the 
close relationship between pre-mRNA splicing 
and retrotransposons. 

There are several key interactions between 
the R2Bm protein, 3'UTR RNA, and target 
DNA (Fig. 1, E and F). The two strands of the 
target DNA separate around the ZnF domain, 
with the bottom strand feeding into the RLE 
active site where the scissile phosphate remains 
bound, while the top strand snakes along the 
opposing surface of the RLE. The RT active 
site contains a heteroduplex formed by the 
nicked bottom strand of the target DNA (5' 
to the cleavage site) and the 5 nt of 28S rRNA 
homology extension beyond the 3'UTR RNA
(Fig. 1G). This target heteroduplex is sur- 
rounded by residues important for RT activity 
(18), and the cryo-EM density shows incorporation
of the ddT chain terminator nucleotide
into the bottom strand (Fig. 1H). The 5’ end of 
the bottom strand remains base-paired to the 
top strand as it leaves the RLE, and this down- 
stream DNA region has weak cryo-EM den- 
sity, suggesting that it is not tightly bound by 
R2Bm. The 248-nt 3'UTR RNA is mostly not 
resolved in the cryo-EM density except for a 
core 40-nt region, which wraps around the 
NTE-1 α helix of R2Bm and the 3’ end of
which is guided into the RT active site via the
NTE0 domain.
Fig. 1. Cryo-EM structure of the R2Bm retrotransposon. (A) Domains of the R2Bm retrotransposon. ZnF, zinc finger; NTE, N-terminal extension; RT, reverse
transcriptase; RLE, restriction-like endonuclease. (B) Schematic of target-primed reverse transcription (TPRT). (C) Denaturing gel of in vitro TPRT reactions on a
labeled 211-bp 28S DNA target. The same gel was visualized by Cy5 fluorescence and toluidine blue staining. (D) Cryo-EM density of the R2Bm TPRT complex.
(E) Cartoon of the cryo-EM structure. Stars represent active sites. (F) Atomic model for the R2Bm TPRT complex. (G) Reverse transcriptase domain and
template-primer duplex. (H) Reverse transcriptase active site. Cryo-EM density is shown as a gray transparent surface.


R2Bm recognizes a sequence motif upstream 
of the cleavage site 

The target 28S DNA sequence has extensive 
interactions with R2Bm (summarized in Fig. 
2A). Upstream bases from -38 to -7 and downstream
bases from +6 to +21 are respectively
paired, whereas the 11 base pairs from -6 to 
+5 are melted around the RLE domain (bases 
are numbered relative to the bottom-strand 
cleavage site). The upstream DNA has a 40° 
bend and binds along the surface of the RT,
linker, and thumb domains in a manner sim- 
ilar to that of the DNA in a recent group IIC 
intron maturase structure (Fig. 2B and figs. 
S5 and S6) (19). Many of the contacts between 

Fig. 2. Target DNA recognition upstream of the R2 cleavage site. (A) Schematic of interactions with the target DNA. Bases are numbered relative to the bottom-strand cleavage site. Positions of protein domains are shown by shaded rectangles. (B) Structure of R2Bm around the upstream DNA sequences. (C) Effect of upstream DNA mutations on target cleavage. The schematic shows the sequences of five DNA sequences tested in top-strand sense; dots represent bases identical to those of wild type. Red triangle, bottom-strand cleavage site. Denaturing gels show in vitro TPRT reactions on labeled 211-bp 28S DNA targets. ΔN, deletion of N-terminal N-ZnF and Myb domains. ΔRT6a, deletion of residues 672 to 677 (DGHRKK) of the RT6a loop. (D) Screen for identifying active RUM sequences. Nicking sites of R2Bm and the restriction endonuclease Nt.BbvCI are shown by triangles. (E) Sequence logo for sequences enriched in the RUM screen. (F to H) Details of interactions between the target DNA and the N-ZnF, Myb, and RT6a loop. (I) Effect of altering the distance between the RUM and RASIN motifs. Denaturing gel shows in vitro TPRT reactions on labeled 211-bp 28S DNA targets. Single-letter abbreviations for the amino acid residues are as follows: A, Ala; D, Asp; E, Glu; F, Phe; H, His; K, Lys; L, Leu; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; and Y, Tyr.

Fig. 3. Target DNA recognition at the R2 cleavage site. (A) Interactions of the top and bottom strands of the target DNA with the ZnF domain of R2Bm. Star, RLE active site. (B) Interactions of the DNA bottom strand with the RLE domain. (C) Interactions of the DNA top strand with the RLE domain. Residues mutated in the RD>AA mutant are highlighted. (D) RASIN sequence requirements for bottom-strand cleavage. The labeled 211-bp 28S DNA targets were incubated with R2Bm and 3'UTR RNA in the absence of deoxynucleotide triphosphates (dNTPs). The reactions


R2Bm and the DNA are via the phosphate 
backbone, suggesting that they are not se- 
quence specific. Based on the structure, how- 
ever, we predicted that two regions are key for 
sequence-specific DNA recognition by R2Bm: 
a 13-bp upstream motif from -34 to -22, which
is bound by the N-terminal N-ZnF and Myb 
domains, and the 7 bp from -6 to +1, which are 
bound by the RLE (Fig. 2A). We term these re- 
gions the retrotransposon upstream motif 
(RUM) and retrotransposon-associated inser- 
tion site (RASIN), respectively. 

Consistent with the importance of the RUM 
region for R2 activity, mutating the entire up- 
stream sequence between -38 to -7 eliminated 
bottom-strand cleavage, whereas mutating the 
downstream sequences between +6 and +37 
preserved wild-type levels of bottom-strand 
cleavage and TPRT (Fig. 2C) (20). Adding just 
the 13-bp RUM region to the upstream mutant 
at positions -34 to -22 restored near-wild- 
type activity, whereas a point mutant RUM 
(G(-27) to C) did not rescue activity (Fig. 2C).
This region of the target was strongly pro- 
tected in a previous deoxyribonuclease (DNase) 
footprinting assay (21). To systematically de- 
termine the importance of each base within 
the RUM, we performed an R2 cleavage assay 
on a DNA target with the upstream region 
(-38 to -7) mutated and the RUM (-34 to -22) 
replaced with a 13N library (Fig. 2D). Sequenc- 
ing of cleaved targets revealed a consensus 
RUM sequence A(-31)WWWGCNNNA(-22), where
W is A/T and N is any nucleotide, with minor
preferences in other positions (Fig. 2E). This
consensus is a close match to the wild-type
28S sequence A(-31)ACGGCGGGA(-22), with the
differences shown in bold.

The RUM is recognized by three domains: 
N-ZnF, Myb, and an R2-specific insertion “6a” 
in the RT domain between motifs 6 and 7 (Fig. 
2B and fig. S7). The N-ZnF has the classical 
C2H2 fold with a zinc ion coordinated between 
an α helix and a β hairpin, but unusually the α
helix binds in the widened minor groove of the 
DNA from bases -18 to -23 instead of the typ- 
ical major groove (Fig. 2F and fig. S6) (22). The 
preference for A at base -22 in the RUM is 
were analyzed with a denaturing gel. Mutations are notated in top-strand sense, but
both strands were mutated. (E) Denaturing gel showing R2Bm cleavage and TPRT 
activity on partially stranded substrates. Reactions contained a fluorescein-labeled 
76-nt bottom strand. Reactions as indicated also contained 17 nt of downstream 
top-strand sequence (17d), 32 nt of upstream top strand sequence (32u), or 60 nt 
of top-strand sequence fully complementary to the bottom strand spanning the 
upstream and downstream regions. RD>AA, R2Bm R901A D902A.


likely due to N-ZnF Arg'”’, which hydrogen 
bonds with the minor-groove-facing side of 
the A-T base pair (Fig. 2F). The Myb domain 
forms a typical three-helix bundle, with the 
third helix bound in the major groove from 
bases -31 to -34 (22) while its linker to N-ZnF 
engages with base -30 (Fig. 2G). This is reminiscent
of other Myb-DNA structures, including
telomere-interacting protein Rap1 (23).
The Myb domain recognizes the A at base -31 
through hydrogen bonds with Lys"? (Fig. 2G). 
Although Arg?’ contacts bases at positions 
-33 and -34, these contacts appear not to be 
sequence specific, as the RUM screen showed 
only weak sequence preferences in this region 
(Fig. 2, E and G). Deletion of the N-ZnF and 
Myb domains together (ΔN mutant) completely
inhibits target DNA nicking and subsequent
TPRT (Fig. 2C) (20). The central GC of the RUM 
is recognized by His®”? and Lys®” of the loop 
6a of the RT domain (Fig. 2H). Structural pre- 
dictions suggest that this loop is specific among 
non-LTR RT domains to R2 proteins (fig. S7).
We found that deletion of the 6a loop inhibits 




diagram of the 3'UTR RNA, based on (26). Thicker strokes represent nucleotides visible in the cryo-EM density. Nucleotides are numbered from the first base of the 3'UTR (the base following the stop codon). (B) Structure of the 3'UTR RNA core and the R2Bm NTE-1 domain. Dotted lines, hydrogen bonds. (C) Low-pass filtered cryo-EM map. (D) Interactions between 3'UTR bases. Dotted lines, hydrogen bonds. (E) Secondary structure of the R2 tag RNA. Unshaded bases
labeled 211-bp 28S DNA target using various R2 RNAs. Highlighted mutants are in the


target DNA nicking (Fig. 2C). Finally, we found 
that the distance between the RUM and the 
bottom-strand cleavage site (the RASIN) is im- 
portant. Increasing the distance by one base 
was tolerated, but further increase or any de- 
crease to the distance inhibited target cleavage 
(Fig. 2I).


Target DNA interactions at the cleavage and 
integration site 


The second key region for DNA target recog- 
nition by R2Bm is the target site for nicking 
by the RLE domain and R2 insertion, which 
we term the RASIN. In our structure, the 11 bp 
of the RASIN from -6 to +5 are melted around 
the RLE domain. The ZnF appears to act as 
the “zip,” stacking on the last upstream pair 
C-G(-7) with Arg®”” and Arg®”* and holding 
unzipped strands apart (Fig. 3A). Strand melt- 
ing may be enhanced by the 40° bend in target 
DNA around the RUM (Fig. 1F). Bases -6 to -1 
on the bottom strand then follow a cleft between
the ZnF and the RLE, which adopts a
canonical PD-(D/E)xK-family nuclease fold,
but with the characteristic Lys’©”° on an α helix
instead of the usual β strand (Fig. 3B) (24). This
lysine, along with catalytic residues Asp*”’ and
Asp’°™, is 4 to 6 Å from the scissile phosphate
of C(-1), suggesting that C(-1) may be close to 
its position during catalysis of bottom-strand 
cleavage. On the top strand, bases -6 to +2 all 
make extensive contacts along a cleft between 
the RLE and linker domains, except for A(-4), 
which flips out and contacts C126 of the 3’UTR 
(Fig. 3C). To determine the relative impor- 
tance of the bases in the RASIN, we mutated 
each of the 11 bp individually and tested the 
effect on bottom-strand cleavage. Mutating 
T(+1) to A abolished cleavage entirely, and 
mutating T(-6), T(-5), and A(-3) severely de- 
creased activity, whereas other changes were 
tolerated (Fig. 3D). This suggests the following 
J1/2 region. The same gel was visualized by Cy5 fluorescence and toluidine blue
staining. (G) The R2-tag allows TPRT of cargo RNAs. Denaturing gel shows TPRT 
reactions with equimolar amounts of the indicated RNAs and a labeled 211-bp 28S DNA 
target. R2 tag (43 nt) was added to the 3’ end of a 239-nt RNA encoding the CMV 
promoter or a 764-nt RNA encoding GFP. 


RASIN motif for cleavage, given in top-strand 
sense: T(-6)TNANNT(+1).

Because only the bottom strand of the RASIN 
enters the RLE active site, we tested the ac- 
tivity of R2Bm on a single-stranded DNA with 
the bottom-strand sequence and found that it 
was cut, albeit weakly (Fig. 3E). Endonuclease 
activity was strongly stimulated by providing a 
60-nt top strand spanning the RASIN and up- 
stream and downstream sequences, but was 
similarly stimulated by a 32-nt top strand com- 
plementary only to the upstream region contain- 
ing the RUM. A 17-nt top strand complementary 
to the downstream sequence did not stimulate 
activity (Fig. 3E). This suggests that the RUM 
in a double-stranded state is important for re- 
cruiting the R2Bm RLE to the RASIN bottom 
strand and that the top strand of the RASIN, 
despite its extensive interaction with R2Bm, 
is dispensable for specific bottom-strand cleav- 
age. However, when we added deoxynucleotides 

Fig. 5. Mechanism and retargeting of first-strand synthesis by R2Bm. (A) Model for the initial stages of target site cleavage and first-strand synthesis. (B) Design of R2Bm + Cas9 experiments. (C) Complementation of DNA target site mutants by Cas9 cleavage in trans and cis. The denaturing gel shows in vitro TPRT reactions on a labeled 211-bp target corresponding to the wild-type 28S target, or two 235-bp targets: one where the RASIN TAAGGTA is replaced by 31 bp of unrelated sequence, and another where the 13-bp RUM is additionally scrambled. R2Bm and SpCas9(H840A) were added in trans, or in cis connected by a 33XTEN linker (fusion indicated by a shaded box). The sgRNA is complementary to the inserted sequence and nicks 40 nt from the last RUM base. The R2 RNA is the 3'UTR with 5 nt of 3' homology to the nick site. (D) Sequences used for retargeting R2Bm to an unrelated locus from the Drosophila virilis genome. (E) Denaturing gel of in vitro TPRT reactions on the labeled 192-bp Drosophila virilis target. sgRNAs are numbered as in (D); all R2 RNAs or R2-tagged RNAs have 10 nt of 3' homology to the nick site of the sgRNA.


to these reactions, TPRT activity was eliminated 
in the absence of the top strand from the RASIN 
downstream but was partially rescued if the 
3'UTR RNA contained 3’ homology to the tar- 
get site (Fig. 3E). The top-strand RASIN bases 
A(-4), A(-3), and G(-2) are grasped by Arg?”! 
and Asp°” of the R2Bm linker (Fig. 3C). We 
mutated these two residues to alanine and 
tested TPRT activity on a fully double-stranded
substrate, and found that TPRT activity was 
reduced and partially rescued by 3’ homology 
(Fig. 3E). These results suggest two important 
factors for initiating TPRT when the 3’UTR 
RNA lacks 3’ homology: (i) the presence of a 
top strand downstream of the RASIN, which 
may help retain the nicked bottom strand, and 
(ii) contacts between R2Bm and the top-strand
RASIN, which help the nicked bottom strand 
“pivot” into the RT active site. 


R2Bm binds a small core region of the 3'UTR 


R2Bm can only initiate TPRT on RNAs contain- 
ing the R2 3'UTR (self-specificity), but the mo- 
lecular basis for this is not known (25). Multiple 




models have been proposed for the secondary 
structure of the R2 3’UTR, and the divergent 
sequences of R2 RNAs have hindered identi- 
fication of key bases (26, 27). A model for the 
R2 3'UTR secondary structure based on chem- 
ical probing is shown in Fig. 4A and has at 
least 11 stems (26). In our cryo-EM map, we 
resolved density for two stems and their flank- 
ing single-stranded regions (Fig. 4B). On the 
basis of nomenclature commonly used for structured
RNAs, we name these stems P1 (nucleotides
33 to 38 and 120 to 125) and P2
(nucleotides 131 to 137 and 236 to 242), and 
term the single-stranded junction between P1 
and P2 as J1/2 and the single-stranded region 
preceding P1 as J0/1. The rest of the 3'UTR
may occupy a diffuse cloud of cryo-EM density 
next to these core regions (Fig. 4). 

P1 and J1/2 are mainly recognized by an α
helix from the R2Bm NTE-1 domain, which 
packs into the major groove of P1 and is wrap- 
ped by J1/2 (Fig. 4B). Arg®°” recognizes the 
Hoogsteen edge of P1 G33, and the interaction 
is secured by Arg®”° and Arg?". Consistently, 
these residues were previously shown to be 
essential for RNA binding (5), and the first 
45 bases of the 3’UTR are essential for TPRT 
activity (11). J1/2 makes numerous sequence-specific
contacts (Fig. 4D): A127 forms a sugar-edge
pair with the Watson-Crick face of J0/1
A32, A128 hydrogen bonds to Leu” and Lys” 
of the R2Bm thumb domain and stacks on 
NTE-1 Tyr®™“, U129 hydrogen bonds to Glu?” 
and Lys®”? of NTE-1, and C126 stacks on and 
hydrogen bonds with A(-4) from the top strand 
of the DNA target (Fig. 4, B and D). 

To test if regions of the R2 3’UTR not clearly 
visible in the cryo-EM density are required for 
TPRT activity, we designed a 43-nt minimal 
3'UTR—“R2 tag”—that contains only the se- 
quences visible in the cryo-EM density, linked 
by tetraloops (Fig. 4E). The R2 tag was reverse 
transcribed as efficiently as the full 248-nt 
3'UTR in a TPRT reaction. We tested the importance
of the J1/2 linker by making single-base
transversions and found that A127U reduced 
activity and A128U almost completely abol- 
ished TPRT activity (Fig. 4F). Mutating G33 to 
C to disrupt base pairing at the bottom of stem 
P1 also reduced activity but could be rescued 
by the compensatory C125G mutation (Fig. 4F). 
Mutation of J0/1 A32 to G reduced activity, but
mutations to C or U were tolerated. Equivalents
to P1, P2, J0/1, and J1/2 can be identified
in the secondary structures of diverse
R2 elements (26) (fig. S7). The P1 and P2 
stems have different sizes and base compo- 
sitions, but positions 2 and 3 of J1/2, cor- 
responding to A127 and A128, are conserved as 
adenosines, consistent with their importance 
for TPRT. 

Because the R2 tag alone is efficiently in- 
tegrated in a TPRT reaction, we tested if ad- 
ding the R2 tag to the 3’ end of a “cargo” RNA 
would allow its integration at the 28S target
site. We added the R2 tag to the 3’ end of a 
239-nt cytomegalovirus (CMV) promoter RNA. 
This tagged RNA was used as efficiently as 
wild-type R2 3'UTR in a TPRT reaction, whereas 
an untagged RNA was not used, nor was an 
RNA tagged with an R2-tag A128U mutant 
(Fig. 4G). A larger RNA containing the 720-nt 
coding sequence for green fluorescent protein 
(GFP) and a 3’ R2 tag was also reverse tran- 
scribed in a TPRT reaction (Fig. 4G). 


R2Bm can be retargeted with CRISPR-Cas9 


Our structural and biochemical observations 
suggest a multistep model for initiation of 
TPRT: The R2Bm N-terminal domains first 
detect a RUM sequence, followed by cleav- 
age of the bottom strand at the RASIN site, 
possible pivoting of the nick around the top 
strand into the RT active site, annealing of 
any 3’ homology to the nicked bottom strand, 
and finally initiation of reverse transcription 
(Fig. 5A). This model implies that R2Bm could 
prime reverse transcription off an exogenously 
nicked bottom strand close to the R2Bm bind- 
ing site (Fig. 5B). To test this, we replaced the 
RASIN and downstream sequences of the 28S 
DNA target with an unrelated sequence con- 
taining an efficient SpCas9 target sequence, 
but kept the RUM sequence to anchor R2Bm 
(Fig. 5B). This substrate could not be cleaved 
by R2Bm but was nicked efficiently by an SpCas9 
H840A nickase mutant (Fig. 5C). When SpCas9 
and R2Bm were added together with a single-guide
RNA (sgRNA) and an R2 3'UTR RNA
with 5 nt of 3’ homology to the sgRNA nick 
site, we detected low amounts of TPRT activ- 
ity. This activity was enhanced when the R2Bm 
and SpCas9 proteins were fused with a 33XTEN 
flexible linker (Fig. 5C). The RUM was not re- 
quired for Cas9-directed TPRT, as mutating the 
RUM did not reduce activity (Fig. 5C). This 
suggests that Cas9 might be able to direct 
R2Bm to perform TPRT at loci other than the 
28S target. We mixed the R2Bm-Cas9(H840A) 
fusion protein with a 192-bp target sequence 
from Drosophila virilis, various sgRNAs, and 
R2 3'UTRs with 10 nt of 3’ homology to the 
nick site dictated by the sgRNA (Fig. 5D). We 
found TPRT activity at all Cas9 nick sites, with 
one sgRNA (guide 2) giving efficient activity 
(Fig. 5E). Adding R2Bm and SpCas9(H840A) 
as separate polypeptides also yielded efficient 
TPRT with guide 2 but was less robust with 
other guides (fig. S9). The 239-nt CMV promo- 
ter RNA with a 3’ R2 tag and 10 nt of homo- 
logy to the guide 2 nick site was also reverse 
transcribed efficiently; this activity required 
the R2 tag and was reduced in the absence of 
3' homology or with the R2 tag A128U muta- 
tion (Fig. 5E). Larger RNAs such as GFP could 
also be reverse transcribed at the guide 2 nick 
site (fig. S9). In summary, R2Bm can be retar- 
geted by Cas9 to perform TPRT at unrelated 
loci, and the R2 tag can direct incorporation of
cargo RNAs at these sites. 


Discussion 


Here we show the structure of a non-LTR ret- 
rotransposon during transposition, and we dis- 
sect the principles of target DNA and self-RNA 
recognition. Our structure suggests that R2Bm 
uses its N-ZnF and Myb domains to locate the 
endonuclease target sequence, a model that 
contrasts with the model for other non-LTR 
retrotransposons in which the endonuclease 
domain is the only determinant of target site 
selection (28, 29). We identified two essential 
target site motifs—the RUM and RASIN—that 
are recognized by R2Bm, but we note that 
searching the B. mori genome with a RUM- 
RASIN consensus motif yields many potential 
off-target sites outside of the ribosomal DNA 
arrays (fig. S10). We examined the sequence of 
a previously identified B. mori non-28S inser- 
tion in (30) and found that the target site had 
limited similarity with 28S but had a TTAAcG|T 
RASIN motif (“|” indicates insertion site, low- 
ercase denotes deviation from 28S) and a 
GCTACTTGCGCAT RUM the correct distance 
upstream of the RASIN (fig. S10). Non-28S in- 
sertions, however, are rare, so it is likely that 
other factors are important in regulating R2Bm 
transposition, including chromatin accessibil- 
ity, other sequence motifs, or the ability of the 
target DNA to bend and melt. 
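
The kind of motif search described in this paragraph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes the 10-nt RUM consensus AWWWGCNNNA (ending at position -22), the RASIN motif TTNANNT (positions -6 to +1), and a spacer of 15 or 16 nt between them, reflecting the observation that one extra base is tolerated. The names `motif_to_regex` and `find_rum_rasin` and all parameters are hypothetical, and the test sequence is synthetic.

```python
import re

# IUPAC codes needed for the two motifs (W = A or T, N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "N": "[ACGT]"}


def motif_to_regex(motif):
    """Translate an IUPAC motif string into a regular-expression fragment."""
    return "".join(IUPAC[base] for base in motif)


def find_rum_rasin(seq, rum="AWWWGCNNNA", rasin="TTNANNT", spacers=(15, 16)):
    """Return sorted (rum_start, spacer) pairs for RUM-spacer-RASIN matches.

    Assumption: positions -21 to -7 separate the two motifs (15 nt), and a
    single extra base (16 nt) is also allowed.
    """
    seq = seq.upper()
    hits = []
    for spacer in spacers:
        body = motif_to_regex(rum) + "[ACGT]{%d}" % spacer + motif_to_regex(rasin)
        # A lookahead keeps overlapping candidate sites from being skipped.
        for match in re.finditer("(?=(%s))" % body, seq):
            hits.append((match.start(), spacer))
    return sorted(hits)


# Synthetic top-strand example: consensus RUM, 15-nt spacer, RASIN.
synthetic = "GG" + "AATTGCAAAA" + "C" * 15 + "TTAAGGT" + "GG"
print(find_rum_rasin(synthetic))  # [(2, 15)]
```

A scan like this only reports sequence matches on one strand; as the text notes, chromatin accessibility and DNA bending or melting are likely to matter as well and are not modeled here.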

Non-LTR retrotransposons form a diverse 
family, and even within the R2 superclade 
there are notable differences between elements. 
R2Bm is a representative of the R2-D clade of 
elements, which have a single C2H2 N-terminal 
ZnF domain, but R2-A clade elements have 
three tandem N-terminal ZnF domains (31)
that may create a more extensive DNA binding 
interface with greater stringency in target site 
selection. More broadly, non-LTR retrotrans- 
posons can be divided into two types on the 
basis of their endonuclease domains: those 
that, like R2Bm, use a C-terminal restriction 
enzyme-like (RLE) domain, and those that, like 
human LINE-1, use an unrelated N-terminal 
apurinic or apyrimidinic endonuclease (APE) 
domain (32, 33). Structure prediction using 
AlphaFold (34) suggests that, in these retro- 
transposons, the position of the APE domain 
is distinct from that of the RLE domain in 
R2Bm, suggesting that there may be mecha- 
nistic differences in how target cleavage is 
coupled to reverse transcription (fig. S5) (35). 
Nonetheless, the similarity between the DNA 
interface on the R2Bm thumb domain and the 
corresponding interface in the group IIC in- 
tron (fig. S5) suggests that this interface might 
be conserved among most non-LTR retrotransposons
(19). Indeed, the upstream DNA from
R2Bm was easily modeled into an AlphaFold 
model of human LINE-1 ORF2, including not 
only the thumb interactions but also strand 
separation by the CCHC ZnF domain, which in
LINE-1 ORF2 corresponds to the C-terminal 
cysteine-rich domain (fig. S5). 

Overall, the results of this work advance our 
understanding of transposition by non-LTR 
retrotransposons and suggest avenues for en- 
gineering these transposons for targeted gene 
insertions. 
SEXUAL SELECTION


Female preference for rare males is maintained by indirect selection in Trinidadian guppies


Tomos Potter¹*†, Jeff Arendt²†, Ronald D. Bassar³, Beth Watson⁴, Paul Bentzen⁴, Joseph Travis¹, David N. Reznick²


When females prefer mates with rare phenotypes, sexual selection can maintain rather than deplete 
genetic variation. However, there is no consensus on why this widespread and frequently observed 
preference might evolve and persist. We examine the fitness consequences of female preference for rare 
male color patterns in a natural population of Trinidadian guppies, using a pedigree that spans 10 
generations. We demonstrate (i) a rare male reproductive advantage, (ii) that females that mate with 
rare males gain an indirect fitness advantage through the mating success of their sons, and (iii) the 
fitness benefit that females accrue through their “sexy sons” evaporates for their grandsons as their 
phenotype becomes common. Counter to prevailing theory, we show that female preference can be
maintained through indirect selection.


Whether female preference for rare males
can sustain genetic polymorphisms in
nature has long been controversial (1).
Rarity, as an attractive trait, complicates
sexual selection theory for the
evolution of female preference because it in- 
troduces negative frequency-dependence. Nega- 
tive frequency-dependent selection occurs when 
the fitness of a trait decreases as it becomes 
more common and increases as it becomes 
rarer. In the absence of negative frequency- 
dependence, female preference for certain 
male traits can be explained if the sons of at- 
tractive males are also attractive (2, 3) or if 
paternal attractiveness correlates with enhanced 
viability in offspring (4, 5). However, when 
attractiveness is frequency-dependent, as is 
the case when rare male phenotypes have an 
advantage, sons can become victims of their 
father’s success. Specifically, the progeny of 
successful rare males are doomed to become 
common and thus unattractive. Although there 
is robust theory describing conditions under 
which female preference for rare males may 
evolve (6), it is unclear whether the costs of 
such a preference will outweigh its benefits (7). 
Numerous laboratory studies have demon- 
strated a rare-male mating advantage in sev- 
eral taxa [reviewed in (8, 9)]. However, we 
know of only five such studies in nature (10-14). 
As is common in laboratory studies, all studies 
in nature except one (12) reduced the options
for female choice to just two male morphs 
(albeit in some cases noting variation within 
morphs) (11, 14). These studies were further 
limited to a single mating season, precluding 
the detection of long-term fitness consequences 
for females that mate with rare males. As a
result, although these studies have docu- 
mented rare-male advantage and demon- 
strated its proximate mechanism through 
female preference behaviors, the ultimate, 
evolutionary explanation of why females pre- 
fer to mate with rare males remains unknown. 
We show an advantage for rare color pat- 
terns in males under natural conditions in the 
highly polymorphic Trinidadian guppy (Poecilia 
reticulata) and quantify the fitness conse- 
quences of this advantage over multiple gene- 
rations. Male color patterns in guppies are 
reliably transmitted from father to son (15-18). 
We identified 27 distinct color patterns in our 
study population and have confirmed that all 
males within a patriline share the same pat- 
tern (Fig. 1) (19). During courtship, males dis- 
play their pattern to potential mates (20). 
Numerous laboratory studies (15, 20-30) and 
one field manipulation (31) have demonstrated
female preference for rare or unfamiliar
male patterns. Whether those results apply to 
the much wider level of unmanipulated varia- 
tion in wild male color patterns is unknown. 
We tested these ideas as part of an experi- 
mental study of evolution in a natural stream 
in Trinidad (19, 32, 33). We used monthly mark-recapture
data to determine the presence and
movement patterns of individuals within the 
population over this period. We used a micro- 
satellite-based pedigree to determine the rela- 
tedness of individuals and the reproductive 
success for each individual every month over 
this period (19). Our dataset includes monthly 
observations of 7173 individuals spanning 10 
generations (34). We used generalized linear 
mixed effects models (GLMMs) to test the ef- 
fects of male pattern rarity and novelty (defined 
below) on components of fitness in guppies. 


Measuring rarity and novelty 


We assigned a “rarity” score monthly to each
male. We calculated the rarity of a focal pattern
($r_i$) as a function of the total number of
individuals with that pattern ($n_i$), the total
number of individuals of all patterns ($N_p$), and
the degree of polymorphism, i.e., the number
of patterns ($P$):

$$r_i = \ln\left(\frac{n_i P}{N_p}\right)$$


Weighting the relative frequency of patterns
($n_i/N_p$) by $P$ allows meaningful comparison across
localities with different degrees of polymorphism.
Another useful aspect of this approach
is that log-transformation results in rare patterns
having negative values, common patterns
having positive values, and patterns that are
neither rare nor common ($n_i/N_p = 1/P$) having
a value of zero. This makes linearized model
coefficients directly interpretable. To illustrate
our results, we define a “rare” male as one with
a pattern half as frequent as expected given
the total degree of polymorphism [i.e., $r_i = \ln(0.5)$],
and a “common” male as one with a
pattern twice as frequent as expected [i.e.,
$r_i = \ln(2)$]. These illustrative values fall well
within the observed distribution of male pat- 
tern rarity (fig. S1). 
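
As a worked illustration of the rarity score, the short Python sketch below computes the log of the pattern frequency weighted by the number of patterns for a toy set of males. The function name `rarity_scores` and the toy data are hypothetical and are not taken from the study; the sketch simply assumes one pattern label per male in a single locality and month.

```python
import math
from collections import Counter


def rarity_scores(patterns):
    """Map each pattern to ln(n_i * P / N_p) for one locality and month.

    patterns: one pattern label per male (hypothetical input format).
    """
    counts = Counter(patterns)   # n_i for each pattern
    n_total = len(patterns)      # N_p, all males of all patterns
    n_patterns = len(counts)     # P, degree of polymorphism
    return {p: math.log(n * n_patterns / n_total) for p, n in counts.items()}


# Toy example: 12 males, 3 patterns, so the expected count per pattern is 4.
males = ["a"] * 8 + ["b"] * 2 + ["c"] * 2
scores = rarity_scores(males)
print(scores)
```

Here pattern "a" is twice as frequent as expected and scores ln(2), the illustrative "common" value, whereas "b" and "c" are half as frequent as expected and score ln(0.5), the illustrative "rare" value.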

A female’s assessment of male rarity will de- 
pend upon the males that she regularly en- 
counters, which may be a spatial subset of the 
total population. The stream habitat is sub- 
divided into discrete pools connected by riffles. 
Our spatially explicit mark-recapture censuses 
allowed us to reconstruct patterns of move- 
ment and make inferences about population 
structure. To assess the possibility that female 
mate preference is shaped by that structure, 
we calculated rarity at three spatial scales: the 
local level (e.g., the pool or riffle where the fish 
was caught that month), the neighborhood, 
and the whole population. Neighborhoods 
were defined using network analysis of move- 
ment of male guppies between pools (19). We 
identified four distinct multipool neighbor- 
hoods characterized by high movement within 
but low movement between. Males moved 
around considerably (62% were new arrivals 
to pools each month, 17% were new arrivals 
to neighborhoods, fig. S2), whereas females 
moved around much less (28% in pools, 4% 
in neighborhoods, fig. S2). When a female assesses
how rare a male is, she is likely to see all
those males that we collected in the pool with 
her that month. Although we did not observe 
this directly, our neighborhood-level analyses 
include males likely to have passed through the 
pool in the previous month. 

In addition to an advantage to rarity, several 
studies have demonstrated the advantage of 
novelty in the form of female preference for 
unfamiliar males (15, 22, 23, 26, 35), regard- 
less of their color pattern. Female guppies may 
identify novel males through olfactory cues 
(36). As such, males that are new arrivals in a


Fig. 1. Male color patterns in guppies are reliably transmitted from father to son. Here, we show three generations (father, son, and grandson) for five example 
patrilines showing consistency in color pattern within a Y-lineage. Numbers to the left indicate which lineage of 27 are being shown. Some elastomer marks, used 
to identify individuals, are clearly visible. For example, the son of lineage 14 has a red mark on the dorsal side of the caudal peduncle. 


pool or neighborhood may experience a re- 
productive advantage. To test this, we defined 
males as “novel” if they were new arrivals to 
the pool or neighborhood in which they were 
caught that month. By this definition, males 
cease to be novel one month after arriving in 
a locality. 
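
The novelty definition can be expressed as a simple rule over monthly capture records. The sketch below is one possible reading, with assumptions flagged: a male counts as novel in a month if he was not recorded in the same locality in the previous month, and a male's first recorded capture is treated as novel. The function name `flag_novel` and the record format are hypothetical.

```python
def flag_novel(captures):
    """Flag each (male_id, month) record as novel or resident.

    captures: iterable of (male_id, month, locality) tuples, with months as
    consecutive integers (hypothetical format).
    Assumption: a male is novel if he was not recorded in the same locality
    in the previous month, so first captures count as novel.
    """
    location = {(male, month): loc for male, month, loc in captures}
    return {
        (male, month): location.get((male, month - 1)) != loc
        for male, month, loc in captures
    }


# One male tracked over four months across two pools.
records = [(1, 1, "pool_A"), (1, 2, "pool_A"), (1, 3, "pool_B"), (1, 4, "pool_B")]
print(flag_novel(records))
```

Under this rule the male is novel in months 1 and 3 and a resident in months 2 and 4, matching the statement that novelty lasts one month after arrival.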


Results 


We found evidence for negative frequency- 
dependent selection operating on male color 
patterns, resulting in a rare-male advantage 
(Fig. 2 and table S1). These effects were sig- 
nificant over all three spatial scales over which 
male rarity was calculated but were strongest 
and weakest at the neighborhood and local levels, 
respectively, as determined by Akaike information
criterion (AIC) scores (table S1). Each month,
males with rarer patterns at the neighborhood 
level had 36% more mating partners (GLMM, 
n = 6248, P-value = 2.43 x 10~°) and ultimate- 
ly sired 38% more offspring that recruited into 
the population (GLMM, n = 6248, P-value = 
4.75 x 107°) (Fig. 2). These results, observed
over multiple generations, provide strong evi- 
dence that negative frequency-dependent sex- 
ual selection is occurring through rare-male 
advantage. 
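
To see how a model coefficient on the rarity score translates into a percentage difference between "rare" and "common" males, the sketch below assumes a log link, which the text does not state explicitly. The function name `percent_difference` and the coefficient value are hypothetical and are not estimates from the study.

```python
import math


def percent_difference(beta, r_from=math.log(2), r_to=math.log(0.5)):
    """Percent change in the expected response when rarity moves r_from -> r_to.

    Assumes a log link, so the expected response scales as exp(beta * r_i).
    Defaults compare a "common" male (ln 2) with a "rare" male (ln 0.5).
    """
    return 100.0 * (math.exp(beta * (r_to - r_from)) - 1.0)


# Hypothetical slope: a negative coefficient means rarer males do better.
print(percent_difference(-0.5))
```

With the default arguments the rarity difference is ln(0.5) - ln(2) = -ln(4), so a slope of -0.5 corresponds to a doubling of the expected response for rare relative to common males.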

Novel males (new arrivals to a pool or 
neighborhood, regardless of their color pat- 
tern) also had a large reproductive advantage 
over residents (Fig. 2 and table S1), with the 
effect strongest at the local level. Compared 
with residents of equivalent rarity, new arri- 
vals to pools had 45% more mating partners 
(GLMM, n = 6248, P-value = 2.77 x 10™*) and 
sired 50% more offspring each month (GLMM, 
n = 6248, P-value = 3.30 x 10). 

In line with earlier studies (15, 20-30), our 
results indicate that female guppies prefer to 
mate with rare color-patterned and/or unfam- 
iliar males. One potential explanation for this 
preference is inbreeding avoidance: rare or 
novel males may be less likely to be kin (37). 
However, we found no evidence that male 
rarity was associated with the relatedness of 


mating partners (linear model, n = 1259, 
P-value = 0.273) (19). Surprisingly, resident 
males were more likely to be unrelated to their 
partners than novel males who were new arrivals
to pools [P(unrelated|resident) = 0.28,
P(unrelated|novel) = 0.15 (logistic regression,
n = 1580, P-value = 5.74 x 107*)]. This could
occur if males are more likely to remain in 
pools with unrelated females. Nevertheless, 
the higher relatedness of novel (and thus at- 
tractive) males and the absence of any as- 
sociation of relatedness with rarity contradict 
the inbreeding-avoidance hypothesis (table S2). 

We found no direct benefit for females’ pre- 
ference for rarity or novelty. Mating with rare 
males did not result in more recruited off- 
spring (table S3 and Fig. 2C, GLMM, n = 2290, 
P-value = 0.448), nor did mating with new 
arrivals to neighborhoods (P-value = 0.969) 
or pools (P-value = 0.397). Our measure of 
recruited offspring refers to those that sur- 
vived to be large enough to be individually 
marked (~2 months old) (19), meaning that 

Fig. 2. Rare and novel males have higher 
reproductive fitness, and females that mate 
with rare males have more grand-offspring. 
Effects of male pattern rarity (r;) and novelty (new 
arrival or resident) on components of fitness. 
(A) Number of mating partners per month (for 
males); (B) monthly number of offspring 
recruited into the population (for males); 

(C) number of recruited offspring per mating 
(for males and females); (D) number of grand- 
offspring that ultimately recruit into the 
population from a single mating (for males and 
females). Values for “rare” [r_i = ln(0.5)] and
“common” [r_i = ln(2)] males are indicated
with dotted lines annotated r and c, respectively;
the dashed line indicates r_i = 0. Shaded areas
are 95% confidence intervals, N.S. indicates 
that the slope is not significant (P > 0.05). 
Predictions are based on models where rarity 
was calculated at the neighborhood level (A) 
and (B) or the population level (C) and (D). 


variation in offspring viability is captured in 
this metric. This indicates that preference for 
rare or novel males is not under direct selec- 
tion through mechanisms that enhance off- 
spring viability, such as inbreeding avoidance 
or so-called “good genes” (4, 5). 

What then is the ultimate benefit of mating 
with rare or novel males? Although we de- 
tected no indirect fitness benefits for females 
that mated with novel males [i.e., that were 
new arrivals to neighborhoods (GLMM, n = 
1951, P-value = 0.439) or pools (P-value = 
0.573)], matings with rare males (at the pop- 
ulation level) ultimately resulted in 48% more 
grand-offspring recruited into the population 
than matings with common males (P-value = 
3.42 x 10~“*). This is a substantial indirect fit- 
ness benefit for those females (table S3 and 
Fig. 2D). 

Females that mated with rare males gained 
this indirect fitness advantage through the en- 
hanced reproductive success of their sons: a 
so-called “sexy sons” effect [in the sense of 
[Fig. 3 axis label: Generation. Percentage labels: 100%, 69%, 38%.]


Fig. 3. Rare males have rare sons but common 
grandsons. In this figure, we track the trajectory
of male pattern rarity (r_i, at the population level,
x axis) over three generations (y axis), focusing on
rare males [f1, r_i < ln(0.5)], their sons (f2), and
grandsons (f3). Lines connect male to sons to
grandsons. Percentages describe individuals in each
generation where r_i < 0, i.e., that are rarer than
expected. 


Kokko (6)]. The sons of rare males (at the 
population level) sired more offspring per 
month (table S4, GLMM, n = 9807, P-value = 
0.0039), but there was no such effect in 
daughters (P-value = 0.575). This occurred 
because the sons of rare males were also rare 
(albeit less so than their fathers) and thus still 
attractive (Fig. 3). This advantage was short- 
lived, however; after two generations of rare- 
male advantage the grandsons of rare males 
became victims of their forefathers’ success 
and were more likely to be common (Fig. 3). 
Consequently, for males with equivalently rare 
fathers, having a rare grandfather reduced re- 
productive success (P = 0.0082). 
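
The rarity thresholds quoted in the figure captions, ln(0.5), 0, and ln(2), read naturally as log ratios. The sketch below assumes that the rarity index is the natural log of a pattern's observed frequency relative to its expected frequency. This excerpt does not give the formula, so that definition, the function names, the "intermediate" label, and the example values are all assumptions made for illustration, not the authors' method.

```python
import math

# Thresholds taken from the figure captions.
RARE_THRESHOLD = math.log(0.5)    # half as frequent as expected
COMMON_THRESHOLD = math.log(2.0)  # twice as frequent as expected


def rarity_index(observed_freq: float, expected_freq: float) -> float:
    """Assumed definition: log ratio of observed to expected frequency."""
    return math.log(observed_freq / expected_freq)


def classify(r: float) -> str:
    """Label a rarity value; 'intermediate' is a hypothetical label."""
    if r <= RARE_THRESHOLD:
        return "rare"
    if r >= COMMON_THRESHOLD:
        return "common"
    return "intermediate"


def share_rarer_than_expected(values: list[float]) -> float:
    """Fraction of individuals with a rarity index below zero."""
    return sum(1 for r in values if r < 0) / len(values)


# Hypothetical frequencies, for illustration only.
print(classify(rarity_index(0.02, 0.10)))
print(classify(rarity_index(0.30, 0.10)))
print(share_rarer_than_expected([-1.0, -0.2, 0.3, 0.5]))
```

Under this reading, the percentages in Fig. 3 correspond to `share_rarer_than_expected` evaluated separately for each generation.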


Discussion 


Female guppies that mated with rare males 
gained no direct fitness advantage in doing 
so. Their offspring did not have increased via- 
bility due to any genetic advantage of their 
attractive fathers, nor were they less inbred. 
Instead, females that mated with rare males 
derived substantial indirect fitness through 
the attractiveness of their sons. This is at odds 
with the long-held prediction that such in- 
direct selection cannot maintain female pref- 
erence (7, 38-40). This prediction is not an 
ineluctable consequence of theory. It is based 
on assumptions about the genetic variances 
and covariances of female preference and 
male traits, which imply that indirect selection 
must always be overwhelmed by the direct 
costs of female choice (7). We suggest that 
this may not hold true when the desirable 
trait is rarity and physical traits are arbitrary. 
Our study shows that female preference can be 
maintained by indirect selection when nega- 
tive frequency-dependence is operating. 

Our findings offer a resolution to the “lek 
paradox” (39). To maintain female preferences
through indirect selection, there must be a
sustained supply of genetic variation in male 
traits (39, 41). The crux of the lek paradox is 
that selection on male traits will erode that 
genetic variation, ultimately resulting in the 
loss of female preference (39). However, when 
females prefer rare males—regardless of male 
genotype—negative frequency-dependent se- 
lection occurs, ensuring the necessary mainte- 
nance of genetic variation. 
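
The contrast between the two kinds of preference can be illustrated with a toy model. The sketch below is not the authors' analysis. It assumes a deterministic population in which sons inherit their father's pattern unchanged, and it uses two hypothetical mating-success rules: one that rewards rarity (success proportional to frequency raised to the power -0.5) and one that always favors the same pattern. All names, starting frequencies, and parameter values are illustrative assumptions.

```python
from typing import Callable, List


def step(freqs: List[float],
         success: Callable[[int, float], float]) -> List[float]:
    """Advance pattern frequencies by one generation.

    Each pattern contributes sons in proportion to its current
    frequency times its mating success. Frequencies must be > 0.
    """
    weighted = [f * success(i, f) for i, f in enumerate(freqs)]
    total = sum(weighted)
    return [w / total for w in weighted]


def run(freqs: List[float],
        success: Callable[[int, float], float],
        generations: int) -> List[float]:
    """Iterate `step` for a fixed number of generations."""
    for _ in range(generations):
        freqs = step(freqs, success)
    return freqs


def rare_male_advantage(index: int, freq: float) -> float:
    """Rarer patterns obtain more matings (negative frequency dependence)."""
    return freq ** -0.5


def fixed_favorite(index: int, freq: float) -> float:
    """Pattern 0 is always preferred, whatever its frequency."""
    return 1.5 if index == 0 else 1.0


start = [0.90, 0.08, 0.02]  # hypothetical starting frequencies
rarity_pref = run(start, rare_male_advantage, 50)
directional = run(start, fixed_favorite, 50)

print("preference for rarity:", [round(f, 3) for f in rarity_pref])
print("fixed preference:     ", [round(f, 3) for f in directional])
```

In this toy model, the preference for rarity pulls all three patterns toward equal frequencies, so variation persists, whereas the fixed preference drives one pattern close to fixation, which is the erosion of variation at the heart of the lek paradox.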

A notable result is the absence of any de- 
tectable fitness benefit, direct or indirect, for 
females that mated with novel males. Novel 
males (new arrivals to pools or neighborhoods) 
had substantially higher reproductive success 
than residents, regardless of the rarity of their 
color pattern. Our results illustrate that mat- 
ing with rare and novel males has distinct fit- 
ness consequences for females: Rare males 
conferred a single-generation reproductive ad- 
vantage to their sons, driving indirect selec- 
tion for female preference, whereas novel males 
conferred no fitness advantages to their part- 
ners, either directly or through their offspring. 
Why then do females prefer novel males? 

One possibility is that female preferences 
for rare and novel males stem from a single, 
simple mechanism: habituation to familiar 
males, i.e., females preferring males that are 
unlike those they have recently encountered 
(21, 22, 27). Males with rare color patterns or 
that are new arrivals to pools (i.e., novel) are 
both likely to fit this criterion. In this scenario, 
selection for choosy females is driven by the 
indirect fitness advantage gained when they 
mate with rare males. By contrast, preference 
for novel males emerges as a nonadaptive by- 
product of the simple behavioral mechanism 
under selection. 

Female choosiness is likely also under 
frequency-dependent selection (6, 42). Consi- 
der what would happen if all females mated 
with a single male bearing the rarest color 
pattern: the sexy son benefit would be lost 
because all male offspring would have the 
same pattern, making it common and thus 
unattractive. As a result, selection for choosy 
females would evaporate. Although we do not 
know the mean frequency of choosy females in 
our population, theory suggests that it is likely 
to be high. Female preference alleles evolve to 
higher frequencies when the ability to express 
choice is hindered (6). Here, female choice was 
hindered by the different movement patterns 
of males and females. The optimum scale on 
which females should choose rare males is at 
the level of the population: Males frequently 
change location so the rarity of sons, upon 
which the indirect benefits to females depend, 
is best predicted by the rarity of fathers at the 
population level (tables S3 and S4). However, 
females can only assess the rarity of males they 
encounter. The more limited movement of 
females meant that they chose males that were 
rare at the level of the neighborhood (table S1).
This mismatch between the optimum and 
realized exercises of choice creates the hin- 
drance that could sustain a high frequency of 
choosy females. 

In conclusion, our results challenge the 
theoretical arguments against the role of sexy 
sons in sexual selection (7, 38-40) by showing 
that this indirect form of selection can sustain 
female preference. At the same time, we show 
that female preference for rare male pheno- 
types resolves the lek paradox. Both results are 
a consequence of negative frequency-dependent 
selection operating on sexual signals and pref- 
erences in guppies. Female preference for rare 
males is well documented in a diversity of or- 
ganisms (8-4), but detecting indirect selection 
in the wild is uncommon because it requires 
multigenerational studies. The replication of 
such studies in other organisms will test the 
generality of our results and determine the 
broader importance of sexual selection in main- 
taining, rather than depleting, genetic variation 
in the wild. 
WORKING LIFE


By E. Celeste Welch


Serious about research—and teaching


I arrived at the “future faculty” workshop hoping to glean tips on how to apply for tenure-track
jobs. But when the time came to discuss my application materials, I was taken aback by the advice.
“Cel,” the faculty mentor said hesitantly, “I’m going to give you some harsh advice that I think you
need to hear. Try to tone down your service and teaching—it doesn’t make you look serious about
research.” I felt my cheeks burning in embarrassment as some of the other attendees nodded their
heads in agreement. And I left feeling bewildered. I’d spent years building up my teaching and
mentoring skills and devoting myself to serving as a positive role model for students. Shouldn’t that
hold value in a faculty job search? Why would I be penalized for it?


During graduate school, I had be- 
come passionate about making engi- 
neering accessible to everyone. I did 
not want others to feel like an imposter,
as I did. When I was an undergraduate,
advisers had told me I wasn’t cut
out for a career in engineering. Con- 
vinced an academic position was off 
the table, I planned to pursue a po- 
sition in industry after graduating. 
I submitted applications to gradu- 
ate school on a whim. I was floored 
when I was accepted. 

When I became a teaching assis- 
tant, I started out badly. I mirrored 
the methods that had been used on 
me for all my life—relying on tra- 
ditionally structured lectures and 
high-stress assignments—and was 
frustrated when students did poorly. 
I expected a lot from them but was 
not sure how to give them the re- 
sources they needed to succeed. 

In my third year, I decided to pursue a teaching certifica- 
tion. I learned that the methods I’d been using had long 
been debunked as ineffective, as they exacerbated perfor- 
mance gaps in students from marginalized backgrounds. I 
remembered that I excelled in courses with interactive lec- 
tures, small group work, low-stakes assignments, and car- 
ing professors who led with empathy. Such approaches, I 
realized, could help other marginalized students succeed. 

I modified my teaching based on what I learned. Later 
that year, after I served as a precollege instructor at another 
university, one student wrote, “I wasn’t sure if I could do 
engineering, but Dr. Welch made everything so simple and 
clear. I realized it’s not that scary after all.” This feedback 
filled me with immense satisfaction. 

I hadn’t forgotten about my research; that was always my 
main focus. But I enjoy multitasking, and teaching also benefited
my research. In the lab, I adjusted
my mentoring style to match
teaching approaches that worked 
well—leading with empathy and 
providing individualized support— 
and my team exploded with produc- 
tivity. Seeing evidence of my ability 
to lead a research team convinced 
me that I could excel in a tenure- 
track faculty position. 

When I asked my mentors how 
to ensure I’d be competitive on the 
academic job market, they always 
said the same thing: be excellent 
in research, teaching, and service. 
But in the same breath they would 
remark on my teaching and service 
work, hinting that those who pur- 
sue such activities are best suited 
to serving as teaching faculty mem- 
bers or diversity, equity, and inclu- 
sion officers. 

Then, at the future faculty work- 
shop, it happened again. The advice rattled around my head 
long after. At first, I responded by reducing some of my 
nonresearch commitments. But then a mentee would need 
advice and I could feel the teaching skills I had spent time 
cultivating coming out. I vowed to not push those activi- 
ties to the side to placate nearsighted views about how re- 
searchers on the academic track should allocate their time. 

As I apply to faculty positions, I hope to land in a place 
that values my commitment to teach, serve, and care for 
my students. I’m not hiding my track record—I’m embrac- 
ing it. Because I’ve seen first-hand the difference pro- 
fessors can make when they invest time and energy in 
teaching and mentoring. 


E. Celeste “Cel” Welch is a Ph.D. candidate at Brown University and 
instructor at Columbia University. 